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Abstract. This paper is a review of recent mathematical and computational 
advances in optical tomography. We discuss the physical foundations of forward models 
for light propagation on microscopic, mesoscopic and macroscopic scales. We also 
consider direct and numerical approaches to the inverse problems which arise at each 
of these scales. Finally, we outline future directions and open problems in the field. 
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1. Introduction 

l> 

; Optical tomography is a biomedical imaging modality that uses scattered light as a probe 
of structural variations in the optical properties of tissue. In a typical experiment, a 
highly-scattering medium is illuminated by a narrow collimated beam and the light 
which propagates through the medium is collected by an array of detectors. There are 
many variants of this basic scenario. For instance, the source may be pulsed or time- 
harmonic, coherent or incoherent, and the illumination may be spatially structured or 
multispectral. Likewise, the detector may be time- or frequency-resolved, polarization 
or phase sensitive, located in the near- or far-field and so on. The inverse problem 
that is considered is to reconstruct the optical properties of the medium from boundary 
measurements. The mathematical formulation of the corresponding forward problem 
is dictated primarily by spatial scale, ranging from the Maxwell equations at the 
microscale, to the radiative transport equation at the mesoscale, and to diffusion theory 
at the macroscale. In addition, experimental time scales vary from the femtosecond 
on which light pulses are generated, through the nanosecond on which diffuse waves 
propogate, to the millisecond scale on which biological activation takes place and still 
longer for pathophysiologic changes. In this paper, we review the mathematical structure 
and computational approaches to the forward and inverse problems which arise at each 
of these scales. 
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The state of progress on the inverse problem of optical tomography was reviewed a 
decade ago in this journal [1]. Since then, the field has grown out of all proportion. It 
is almost impossible to comprehensively summarize the numerous new measurement 
systems, applications, algorithms, physical approximations and theoretical results. 
Nevertheless, in this paper we attempt to give an overview of those advances that have 
proven to be most signicant, and to enumerate those topics which remain unresolved 
and which represent the most fruitful areas for forthcoming research. 

We have structured the paper as a tutorial, emphasizing aspects of wave 
propagation in random media, the mathematical structure of related inverse problems, 
and computational tool for image reconstruction. The selection of topics is to some 
extent personal and reflects the tastes and biases of the authors. We do not attempt 
to summarize the considerable body of work that concerns instrumentation or clinical 
applications of optical tomography. For some reviews of these aspects we refer the 
reader to several articles [2-8]. Nevertheless, it is important to note that there have 
been significant advances on the experimental front that have motivated and impacted 
on the corresponding theoretical developments. 

• The introduction of noncontact imaging systems wherein a scanned beam and a 
lens-coupled CCD is employed to replace the illumination and detection fiber-optics 
of conventional optical tomography experiments [9-11]. Such systems generate 
data sets that are orders of magnitude larger than those acquired with fiber-based 
systems, leading to significant computational challenges for image reconstruction. 

• The development of continuously tunable high-power pulsed light sources has led 
to increased attention to the multispectral aspects of optical tomography. 

• The substantial growth in the availability of targeted fluorescent probes for 
molecular imaging has stimulated the development of computationally efficient 
algorithms for fluorescence optical tomography. 

In addition, from a theoretical point of view, the rapid progress in the analysis of optical 
phenomena at smaller scales has allowed the field to move from simply considering diffuse 
transport as the model of light propogation to include more general radiative transport, 
coherence and polarization effects. 

The paper is organized as follows. In section 2 we introduce some general concepts 
and an abstract framework within which it is possible to describe the class of inverse 
problems that arise in optical tomography. In section 3 we present the essential physical 
ideas that are need to the propagation of light in a random medium. We also describe 
the scattering theory of diffuse waves within radiative transport theory. In section 
4 we define the various imaging modalities that we will study using the framework 
established in section 2. In section 5 we introduce numerical methods for computing 
the Green's functions for the RTE and diffusion equation in the context of forward 
modeling. In section 6 we describe direct reconstruction methods and associated fast 
algorithms for several inverse problems in optical tomography. Such methods are well 
suited to reconstructing images from the large data sets that are available in noncontact 
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optical tomography systems. In section 7 we focus on numerical inversion methods using 
the tools of optimization theory and Bayesian statistics. In section 8 we introduce the 
basic concepts of shape-based methods for image reconstruction. In section 9 we survey 
some remaining topics including reconstruction of anisotropic and time-varying optical 
parameters. 

There are a number of topics that we do not discuss. These include thermoacoustic 
and photoacoustic tomography, acousto-optic imaging, coherent imaging and nanoscale 
optical tomography. Lack of space or expertise, rather than interest, prohibit the authors 
from treating these important subjects. 

2. A General Framework 

In this section we introduce some general concepts and notation for the forward and 
inverse problems occuring in optical tomography. Since the various models are described 
in terms of different physical quantities (space, time or frequency, direction, wavelength), 
we attempt some abstract definitions in terms of a generalised coordinate that we denote 



2.1. Forward problems 

We consider a domain f2 with boundary dVl. To allow for generalised coordinates 
we define the generalised domains S with boundary <9S. Experiments are performed 
by applying inward directed photon currents J~(£ s ),£ s G <9S and measuring outgoing 
photon currents J + (£ m ),£m G dE. In the "non-contact" application, these functions 
are applied and/or measured by free-space propogation from <9H to an exterior surface 
E ext . The propagation of light inside the domain is governed by a differential operator 
C(x) parameterised by functions x(r),r G f2. Let X denote the function space of these 
parameters. In most cases the inverse problem is a parameter estimation problem for x. 

Let § represent a (possibly infinite) set of source functions on dE or E ext . Let Q 
represent the space spanned by the functions in §. Then for a particular applied source 
current J~ G § the measureable current is found by solving 



where B~, B + are the boundary conditions. We assume that in all cases considered 
herein, these boundary conditions are specific enough to ensure uniqueness of U{. The 
measureable currents belong to a space Z. 

In the case where J~ is a 5-function located at £ s the solution Ui defines the Green's 
function G(£,£ s ), and we define the linear Green's operator 



bye 



C{x)Ui = 
B~Ui =Jr 



(2.1) 





U i = g(x)Jr = (G(x),Jr) gB = [ dCU-J, iU<K 



s 



(2.4) 



J (9H 



Optical tomography: forward and inverse problems 4 

where d£ s is the surface integration measure on <9H. 

The definitions (2.1)-(2.3) allow the definition of the mappings 

Transfer function : A : Q -> Z , J+ = A(x) Jr = B + G(x)Jr (2.5) 
Forward Map : F : X -> Z , J+ = F^x) (2.6) 



The (non-linear) forward map is specified for a particular source pattern. Let 
F s = {Fi, J~ G §} be the (possibly infinite) set of all forward mappings for the set of 
source currents S. 

Measured data is obtained as the result of a measurement operator 

M:Z^Y (2.7) 

where Y is a (possibly infinite) vector space. Combining the forward map and 
measurement operator defines a forward operator 

T = MF :X^Y (2.8) 

Let M represent a (possibly infinite) set of aperture functions on <9S or E ext . Then a 
particular datum is obtained by applying a given aperture Wj G M to the measurable 

yj . = M 3 Jt = J+) 9H = ( Wj , F t (x)) m = ( Wj , B + g(x)Jr) m (2.9) 

For the particular case of a 5-function source we have 

y lMs) = (w 1 ,B + G(U,Q) dE (2.10) 
Similarly, if the measurements are considered simply point samples we can consider 

ys(u)M,) = '3 + G(U,Q (2.11) 

2.2. Linearisation 

The key tool for the study of the inverse problem is the linearisation of the forward 
map F. Consider a perturbation in the parameters x{r) — > x{r) + x s (r). The Frechet 
derivative of the forward mapping is the linear mapping defined by 

F-(x)x 5 = Fi{x + x s ) - Fi(x) + o(\\x 5 \\ 2 ) (2.12) 

Let Fg = {Fi, J~ G §} be the (possibly infinite) set of all Frechet derivatives for the set 
of source currents §. 

When the Frechet derivative is projected onto an aperture function we define the 
change in measurement (for a particular perturbation x s ) 



y' jti = ^,F;(x)x 5 ) ds = (f:*(x)w v x 5 ) s 



(2.13) 



Optical tomography: forward and inverse problems 



5 



which serves as the definition of the adjoint Frechet derivative. The adjoint Frechet 
derivative will involve the solution of the adjoint problem 

C*(x)U* = (2.14) 
B + *U* = Wj (2.15) 

It is convenient to define another notation for the Frechet derivative as a general 
linear integral operator 

F[x s = IC lX 5 = Ki (U OAr) ^ (2.16) 

with kernel Ki and d£ the volume integral measure on S. The notation K§ is used for 
the complete set of these linear operators. 

^From the Frechet derivative of the forward mapping we are lead to a definition of 
the Frechet derivative of the Green's operator 

JC lX 5 = B + g'[J- ® x 5 ] = ((B + G', x s ) E , J-) m (2.17) 

The operator Q' has kernel G"(£ m ,£,£ s ) whose form depends on the particular problem 
considered. We may define in general terms a potential operator 

V(x s ) = C(x + x s ) - C(x) (2.18) 

which leads to 

Q' = -QVQ, and g'* = -Q*VQ (2.19) 

These definitions will be made more precise in section 6. 

For the particular case of 5-function source and sampling measurements Wj = 5(£ m ) 
the change in measurement given by (2.13) is simply the inner product of G' with the 
perturbation 

y'siumts) = (B + G'(UZ,ax S (r)) s = (p,x\ (2.20) 

which defines the sensitivity function or measurement density function p(r). More 
generally, the photon measurement density functions are the projection of the three- 
point Green's function onto a given source and measurement aperture 

Pj,i(r)= f w,(U) [ «s)G'(£ m ,£,£s)d£ s d£ m (2.21) 

For the case of finite source and aperture functions, the forward mapping F§ and all 
its Frechet derivatives are continuous-to-discrete, and the adjoint mappings are discrete- 
to-continuous; such mappings necessarily contain an infinite dimensional null-space. 
When considering an infinite set of source and aperture functions, the forward mapping 
does not contain a null-space but is bounded and compact, leading to an inverse problem 
that is unbounded and ill-posed. 

For computational purposes, the parameters of the inverse problem are considered 
in a basis representation 

x(r) = J2 x Mr) (2.22) 
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This allows the definition of a discrete-to-continuous linear mapping 

F' iX 5 = JC t x 5 = J2 x if K ^ Z)h(r) d£ (2.23) 
k ^ s 

Considering a finite set of measurement functions of dimension and a finite set of 
aperture functions of dimension nj leads to a vector of measurements of size n« x rij. 
Taking the basis representation (2.22) leads to a discrete matrix representation of the 
linearised problem 

y 6 = Ax s (2.24) 

where A has matrix elements 

pj,i{r)b k (r) dr = 

[ ^(6n) / J~(Q / G'(U^,Qh(r)d^d^dU (2.25) 

This matrix is typically referred to as the Jacobian, weight matrix or sensitivity matrix 
for the linearised inverse problem. 
The adjoint operation 

x 5 = AV (2.26) 
can be considered as the discretisation of the adjoint operator 

x = ^ V = K*M V = / K l (U,OM*y(r m )dU (2.27) 

J as 

= g'*B+*[Jr®y] = ^G'*,B + *y)^,J-)^ (2.28) 
2.3. Inverse problems 

Our goal is to invert the forward map F and recover the parameters x. To proceed, we 
note that the adjoint Frechet derivative operator defined in (2.27) acts on an element 
in data space to give an element in parameter space representing a change in x that 
reduces the norm of the residual difference between the data and the forward operator 
acting on x. In terms of optimisation theory this direction is the steepest descent update 
direction for the inverse problem in the sense that it gives the direction in which this 
residual norm decreases locally most rapidly. This operator is therefore the basis for 
gradient based optimisation methods. We explore this idea in more detail in section 7.1. 

Solution methods for inverse problems typically perform better if making used of 
second derivative information. The second Frechet derivative is given by 

Fi(x)(x{, x s 2 ) = F t {x + x{ + x 5 2 ) - Flix)^ + x 5 2 ) - F(x) (2.29) 

with corresponding second derivative Green's operator Q" and kernel G", given by 



g" = gvgvg 



(2.30) 



Optical tomography: forward and inverse problems 



7 




Figure 1. Second order derivative operators. On the left is the kernel of K,*K, In the 
centre the term g*VQVg. On the right the term g*VG*VQ. 

and in general we can define the nonlinear mapping F as 

F(x + x s ) = B + [Q{x) - G(x)V(x s )g(x) + • • • + (-l) n G(x) (V(x s )Q(x)) n + • • •] (2.31) 

which is the Born series. We discuss this series in more detail in (3.9) and (3.86) in 
section 3.5. 

For optimisation methods the second Frechet derivative gives rise to a Hessian 
operator 

H = K*K - T"*y & (2.32) 
where the adjoint operator J 7 "* is composed of two terms 

g"* = g*vgvg + g*vg*vg (2.33) 

The second derivative Green's kernels are represented in figure 1. 

Taking the basis representation as in (2.24) leads to a discrete Hessian 
representation 

H = A T A - F"V (2.34) 
which forms the basis of the Newton update scheme 

Hx 5 = AV (2.35) 

This is discussed in further detail in section 7.3. 

3. Waves, Transport and Diffusion 

The mathematical description of light propagation in random media changes according 
to the length scale of interest [12]. We begin with the Maxwell equations, which are 
valid on microscopic scales. The mesoscale, in which the characteristic scale is set by the 
scattering length, is described by the radiative transport equation (RTE). Finally, we 
discuss the macroscale, which is described by the diffusion approximation to the RTE. 
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3.1. Electromagnetic waves 

The Maxwell equations together with the Lorentz force law govern the classical 
description of all electromagnetic phenomena. For simplicity, we restrict our attention 
to nonmagnetic media and to monochromatic fields. We employ the convention that 
physical quantities are given by the real part of the corresponding complex quantities. 
In the absence of sources, the electric field E in an inhomogeneous medium with a 
position-dependent permittivity e satisfies the time-independent wave equation 

V x V x E(r) - kle{r)E{r) = , (3.1) 

where k is the free-space wavenumber. In addition, the divergence condition 

V • e(r)E(r) = (3.2) 

must be obeyed. If e varies slowly on the scale of the wavelength, then E satisfies the 
scalar wave equation 

V 2 U{r) + k 2 e(r)U(r) = , (3.3) 

where U is any of the components of E. Note that the components of E are still coupled 
according to (3.2). If we restrict our attention to (3.3) we will say that we are working 
within the scalar theory of electromagnetic fields. 

The conservation of energy is governed by the relation 

V- J + l!^l m ( e ) / = o , (3.4) 

c 

where the energy current density J is defined by 

J — — ^— (U*VU - UVU*) (3.5) 
and the intensity / is given by 

J=f|f/| 2 . (3.6) 

Note that J plays the role of the Poynting vector in the scalar theory. 

The solution to (3.3), which obeys the Sommerfeld radiation condition, is given by 
the Lippmann-Schwinger equation 

U(r) = U t (r) + k 2 Q J dr'G(r, r')U(r') V (r>) , (3.7) 

where 77 = (e — l)/4n is the dielectric susceptibility. Here Ui is incident field, which 
obeys (3.3) with 77 = and G is the Green's function which, in free space, is given by 
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If we iterate (3.7) starting from U = C/j, we obtain an infinite series for U of the form 
U(r) = U t (r)+k 2 J dr'Giry^ir^Uiir') 

+k 4 Q J dr'dr"G(r, r')r)(r')G(r' , r")r](r")Ui(r") + • • • . (3.9) 

Eq. (3.9) is known as the Born series and each of its terms corresponds to successively 
higher orders of scattering of the incident field. If only the first term in the series is 
retained, this is known as the Born approximation. 

3.2. Radiative transport 

In radiative transport theory, the propagation of light through a material medium is 
formulated in terms of a conservation law that accounts for gains and losses of photons 
due to scattering and absorption [13,14]. The fundamental quantity of interest is the 
specific intensity J(r,s, t), defined as the intensity at the position r in the direction s 
at time t. The specific intensity obeys the radiative transport equation (RTE): 

-— + s-VI+{fi a + fi s )I = fi s lp{s',s)I{r,s')ds', reQ, (3.10) 

where /i a and /i s are the absorption and scattering coefficients. The specie intensity also 
satises the half-range boundary condition 

I(r, s) = 7 inc (r, s) , s • < , r e dQ , (3.11) 

where i> is the outward unit normal to dfl and 7i nc is the incident specific intensity at 
the boundary. The above choice of boundary condition guarantees the uniqueness of 
solutions to the RTE [14]. The phase function p is symmetric with respect to interchange 
of its arguments and obeys the normalization condition 

p(s, §')<*§' = 1 , (3.12) 

for all s. We will often assume that p(s, §') depends only upon the angle between § and 
§', which holds for scattering by spherically-symmetric particles. Note that the choice 
p = l/(47r) corresponds to isotropic scattering. For scattering that is strongly peaked 
in the forward direction (s • s' ~ 1), an asymptotic expansion of the right hand side 
of (3.10) may be performed [15]. This leads to the Fokker-Planck form of the RTE 

1 c)I X 

-— + § • VI + (fi a + n 8 )I = fi a LI , L = -(1 - <?)A g , (3.13) 

where A§ is the Laplacian on the two-dimensional unit sphere S* 2 and g w 1 is the 
anisotropy of scattering, as given by (3.72) in section 3.5. A rational approximation to 
the transport term in which the operator L is given by 

L = aA§(l - feA§) -1 (3.14) 
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has also been proposed [15,16]. Here the constants a, b can be computed from angular 
moments of the phase function. In the case of the Henyey-Greenstein phase function 

1 oo 

P(s-s') = -j2( 2l+1 ^ lp ^^') , (3-15) 

1=0 

it can be seen that 

a =1(1 -,)(! + »), »=^T^- (3-16) 

Note that the form of L in (3.14) is a bounded operator on L 2 (S 2 ), which is not the 
case for the spherical Laplacian that appears in (3.13). 

The total power P passing through a surface £ is related to the specific intensity 

by 

P = J dr J dsl(r,s,t)s-v . (3.17) 

The energy density $ is obtained by integrating out the angular dependence of the 
specific intensity: 

$(r,t) = ~ c J I(r,s,t)ds . (3.18) 

We note that the RTE allows for the addition of intensities. As a result, it cannot 
explain certain wavelike phenomena. 



3.2.1. From waves to transport The RTE can be derived by considering the high- 
frequency asymptotics of wave propagation in a random medium. We briefly recall the 
main ideas in the context of monochromatic scalar waves. The general theory for vector 
electromagnetic waves is presented in [17]. We assume that the random medium is 
statistically homogeneous and that the susceptibility i] is a Gaussian random field such 
that 

fa(r)> = 0, (T,(r)r,(r>)) = C(\r - r>\) , (3.19) 

where C is the two-point correlation function and (• • • ) denotes statistical averaging. Let 
L denote the propagation distance of the wave. At high frequencies, L is large compared 
to the wavelength and we introduce a small parameter e = l/(koL) <C 1. We suppose 
that the fluctuations in rj are weak so that C is of the order 0(e). We then rescale the 
spatial variable according to r — > r/e and define the scaled field U e (r) = U(r/e), so 
that (3.3) becomes 

e 2 V 2 U e (r) + U e (r) = -Atv^V (r/e) U e (r) . (3.20) 

Here we have introduced a rescaling of rj to be consistent with the assumption that the 
fluctuations are of strength 0(e). 

Although (3.4) gives some indication of how the intensity of the field is distributed 
in space, it does not prescribe how the intensity propagates. To overcome this difficulty, 
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we introduce the Wigner distribution W € (r, k), which is a function of the position r and 
the wave vector k: 

W £ (r, k) = J dRe lk - R U e (r - ^eRj U* (r + ^eRj . (3.21) 

The Wigner distribution has several important properties. It is real-valued and is related 
to the intensity and energy current density by the formulas 

• = tJ^y w -^- J -Sw? kw ' {r ' k) - (3 ' 22) 

Making use of (3.20), it can be seen that the Wigner distribution obeys the equation 

k ■ V r W e + i-^L J dqe-^/^iq) (w e (r, k + l -q) - W e (r, k - l -q)^j = , (3.23) 

where we have assumed that rj is real-valued and fj denotes the Fourier transform of r\ 
which is defined by 

77(g) = f dre iqr r]{r) . (3.24) 



We now consider the asymptotics of the Wigner function in the homogenization 
limit e — > 0. This corresponds to the regime of high-frequencies and weak fluctuations. 
We proceed by introducing a two-scale expansion for W e of the form 

W e (r, r', k) = W {r, k) + yftW^r, r', k) + eW 2 (r, r', k) + ■ ■ ■ , (3.25) 

where r' = r/e is a fast variable. By averaging over the fluctuations on the fast scale, 
it is possible to show that (Wo), which we denote by W , obeys the equation 

k ■ V r W = J dk'C(k - k')5(k 2 - k' 2 ) (W(r, k') - W(r, k)) . (3.26) 

Evidently, (3.26) has the form of a time-independent transport equation. The role of 
the delta function is to conserve momentum, making it possible to view W as a function 
of position and the direction k/\k\. We note that the phase function and scattering 
coefficient are related to statistical properties of the random medium, as reflected in 
the appearance of the correlation function C in (3.26). If the medium is composed of 
spatially uncorrelated point particles with number density p, then 

dC s I / n nrt 

Ha = pa a , fx 3 = pa s , p= -j^/os , (3.27) 

where a a and a s are the absorption and scattering cross sections of the particles and 
da s /dfl is the differential scattering cross section. Note that a a , a s and p are wavelength 
dependent quantities. 

The above derivation of the RTE can be modified when the speed of light is not 
constant. Following Ref. [18], the statistically-averaged Wigner function can be seen to 
obey the equation 

nfc-V r W-A; Vn-(/-s®s))V g W = J dk'C{k-k')5{k 2 -k' 2 ) (W(r,k') - W(r,k)) , 

(3.28) 
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where n is the index of refraction. Using the relation / = c(ko/n(r)) 3 W and the above 
result, we obtain the required generalization of the RTE. 

Radiative transport theory can be generalized to account for energy transport for 
different states of polarization [17,19]. The RTE is then formulated in terms of the 
Stokes vector I = I±,U,V). Here the Stokes parameters, which characterize the 
polarization of the field, are defined as 

J,, = {EwE;) , I ± = (E ± E* ± ) , (3.29) 

U = (E\\E* ± + E\E\ , V = i({E\\E* ± - E^E ± )) , (3.30) 

where En , E± denote two orthogonal components of the electric field in the plane 
perpendicular to the direction of propagation. Note that the total intensity / = I\\ + I±. 
The Stokes vector satisfies the generalized RTE 

- — + s • VI + (ji a + fi s )I = n a J M{s', s) • I(r, s')ds' , (3.31) 

where M denotes the Mueller matrix, which accounts for mixing of different states of 
polarization upon scattering. 

3.3. Collision expansion 

The RTE (3.10), obeying the boundary condition (3.11), is equivalent to the integral 
equation 

I(r, s) = I (r, s) + J G (r, s; r', sV(r>(s', s")I(r', s")dr'ds'ds" . (3.32) 

Here Iq is the unscattered (ballistic) specific intensity, which satisfies the equation 

[s • V + fx a + fi s ] J = , (3.33) 
and G is the ballistic Green's function 

G (r, s; r', s') = g(r, r')5 (s> - y^j^ ^ ~ §') , (3-34) 



where 

1 r \r-r'\ / r _ r l \ 

(3.35) 



g(r,r')= 1 exp 



\r — r' 



\r-r'\ . r - r 

fit[r' + £j^—^)d£ 



and the extinction coefficient fit = fi a + f^s- Note that if a narrow collimated beam of 
intensity 7i nc is incident on the medium at the point r in the direction s , then Io(r, s) 
is given by 

J (r, s) = 7incG (r, s; r , s ) , (3.36) 

To derive the collision expansion, we iterate (3.32) starting from = I and 
obtain 

J(r,s) =/ (0) (r,s) + /( 1 )(r,s) + /( 2 )(r, §) + ••• , (3.37) 
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where each term of the series is given by 

j(»)(r, s) = J dr'ds'ds"G (r, s; r', s> 8 (r>(s', s")/ (n-1) (» ,/ , s") , (3.38) 

with n = 1, 2, The above series is the analog of the Born series for the RTE, since 

each term accounts for successively higher orders of scattering. 

It is instructive to examine the expression for which is the contribution to the 
specific intensity from single scattering: 

/^(r, s) = J dr'ds'ds"G (r, s; r', s> s (r')p(s', s")I (r', s") . (3.39) 

It follows from (3.36) that the change in intensity 51 = I — 1^ measured by a point 
detector at r 2 in the direction s 2 , due to a unit-amplitude point source at r*i in the 
direction §i is given by 

/■OO 

<J/(ri, §i; r 2 , s 2 ) = p(§i, s 2 ) / dRR 2 g(r 2 , n + Rs^g^ + Rsi, n) 

./o 

x 5 _ \ + ^ g 

\F2 - Ti - ifei| J 

Suppose that r*i and r 2 are located on the boundary of a bounded domain and §i points 
into and s 2 points out of the domain. Then the rays in the directions §i and s 2 must 
intersect at a point R that lies in the interior of the domain. In addition, the delta 
function in (3.40) implements the constraint that §i, s 2 and t*i — r 2 all lie in the same 
plane. Using this fact and carrying out the integration in (3.40), we find that 

"Li pL2 



57(ri,si;r 2 ,s 2 ) oc exp 



/x t (ri + es^de - / fi t (R + m 2 )di 

ii Jo 



(3.41) 



where L\ = \R — r±\, L 2 = \R — r 2 | and we have omitted overall geometric prefactors. 
Note that the argument of the exponential corresponds to the integral of /j, t along the 
broken ray which begins at r 1: passes through R, and terminates at r 2 . The significance 
of such broken rays will be discussed in section 6.3. 

The terms in the collision expansion can be classified by their smoothness. The 
lowest order term is the most singular. In the absence of scattering, according to (3.35), 
this term leads to a Radon transform relationship between the absorption coefficient and 
the specific intensity, under that condition that the source and detector are collinear. 
Inversion of the Radon transform is the basis for optical projection tomography [20,21]. 
The first order term is also singular, as is evident from the presence of a delta function 
in (3.40). Terms of higher order are of increasing smoothness. This observation has 
been exploited to prove uniqueness of the inverse transport problem and to study its 
stability. A comprehensive review is presented in [22]. 

The above discussion has implicitly assumed that the angular dependence of the 
specific intensity is measurable. In practice, such measurements are extremely difficult 
to obtain. The experimentally measurable intensity is often an angular average of the 
specific intensity over the aperture of an optical system. The effect of averaging is to 
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remove the singularities that are present in the specific intensity. The resulting inverse 
problem is then highly ill-posed [23]. To illustrate this point, we observe that it follows 
from (3.18) and (3.32) that the energy density <3> obeys the integral equation 

1 r -fi t \r-r'\ 

$(r) = $ (0) (r) + — J dr' p(r')$(r') , (3.42) 

where the scattering is assumed to be isotropic and 

$(°)( r ) = I J /(°)( r ,s)ds . (3.43) 

Here the kernel appearing in (3.42) is smoothing with a Fourier transform that decays 
algebraically at high frequencies. 

3.4- Transport regime 

We now develop the scattering theory for the RTE in an inhomogeneously absorbing 
medium. We proceed by decomposing /i a into a constant part p, a and a spatially varying 
part 8fi a : 

Va{r) =fi a + Sfi a {r) . (3.44) 
The stationary form of the RTE (3.10) can be rewritten in the form 

s-VI + fkl-Vsj P(§', s) J(r, s')ds' = -8fi a (r)I , (3.45) 

where \x t = p, a + ji s . The solution to (3.45) is given by 

J(r, s) = Ii(r, s)-J dr'ds'G(r, s; r', s')SpL a (r')I(r' \ §') , (3.46) 

where G denotes the Green's function for a homogeneous medium with absorption p, a and 
Ii is the incident specific intensity. Eq. (3.46) is the analog of the Lippmann-Schwinger 
equation for the RTE. It describes the "multiple scattering" of the incident specific 
intensity from inhomogeneities in 8/i a . If only one absorption event is considered, then 
the intensity / on the right hand side of (3.46) can be replaced by the incident intensity 
Jj. This result describes the linearization of the integral equation (3.46) with respect to 
S/j, a . If the incident field is generated by a point source at 7*1 pointing in the direction 
§i, then the change in specific intensity due to spatial fluctuations in absorption 51 can 
be obtained from the relation 

<J/(ri,si;r 2 ,s 2 ) = h J drdsG(r 1 ,s 1 ;r,s)G(r,s;r 2 ,s 2 )SiJ, a (r). (3.47) 



Here Iq denotes the intensity of the source and r 2 , s 2 are the position and orientation of 
a point detector. 
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To make further progress requires knowledge of the Green's function for the RTE. 
The Green's function for an infinite medium with isotropic scattering can be obtained 
explicitly [14]. In three dimensions we have 

H s f dk i k . (r _ r >\ 1 



G(r, *•■',«') = G„(r,«; + £ / (|^e*<'-"> 



(/i t + is ■ k) (jit + is' ■ k) 



x p_ , (3.48) 

'-ftf-'G?) 

where Go is the ballistic Green's function which is defined by (3.34). More generally, 
numerical procedures such as the discrete ordinate method or the approximation may 
be employed to compute the Green's function in a bounded domain with anisotropic 
scattering, as described in section 5. 

The method of rotated reference frames is a spectral method for the computing 
the Green's function for the three-dimensional RTE in a homogeneous medium with 
anisotropic scattering and planar boundaries [24] . It is derived by considering the plane- 
wave modes for the RTE which are of the form 

I(r,s)=A k (s)e k \ (3.49) 

where the amplitude A is to be determined. Evidently, the components of the wave 
vector k cannot be purely real; otherwise the above modes would have exponential 
growth in the k direction. We thus consider evanescent modes with 



k = iq± vV + 1/A 2 z , (3.50) 

where q ■ z = and k k = 1/A 2 . These modes are oscillatory in the transverse direction 
and decay in the indirections. By inserting (3.49) into the RTE (3.10), we find that 
A k (s) satisfies the equation 

(s • k + n a + n s ) A k (s) = ii s J p(s, s')A k (s')ds' . (3.51) 

To solve the eigenproblem defined by (3.51) it will prove useful to expand A k (s) into 
a basis of spherical functions defined in a rotated reference frame whose z-axis coincides 
with the direction k. We denote such functions by Yi m (s; k) and define them via the 
relation 

i 

Y lm (s;k) = D l mm ,( v ,6,0)Y lm ,(s) , (3.52) 

m'=-l 

where Yi m (s) are the spherical harmonics defined in the laboratory frame, D l mm , is the 
Wigner D-function and </?, 9 are the polar angles of k in the laboratory frame. We thus 
expand A k as 

A k {s) = Y,Ci m Y lm {s ] k) , (3.53) 

l,m 
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where the coefficients C\ m are to be determined. Note that since the phase function 
p(s, §') is invariant under simultaneous rotation of s and s', it may be expanded into 
rotated spherical functions according to 

p(s, §') = J>>U§; fc)Y^(s'; fe) , (3.54) 



where the expansion coefficients pi are independent of k. An alternative approach that 
may be used to solve the eigenproblem (3.51) is to employ the method of discrete 
ordinates [25-27]. 

Substituting (3.53) into (3.51) and making use of the orthogonality properties of 
the spherical functions, we find that the coefficients Ci rn satisfy the equation 

^ Ryln'Cl'm' — ^ClClm ■ (3.55) 
l',m' 

Here the matrix R is defined by 

RlZ* = Jdss- kY lm (s; fe)V ro ,(s; k) (3.56) 



— 0~mm' (blm^l',1-1 + h+l,rrM' ,l-l) , 



where 
and 



b lm = v /(/2_ m 2 )/(4/2 _ 1) ; (3 . 57) 



o x = /jL a + /jL s (l - pi) . (3.58) 

Eq. (3.55) defines a generalized eigenproblem which can be transformed into a standard 
eigenproblem as follows. Define the diagonal matrix 5^,™, = ^mm'^w \fo T i- Note that 
o\ > since pi < 1 and thus S is well defined. We then pre and post multiply R by 
S* -1 and find that Wip = \ip where W = S^^^RS^ 1 and = SC. It can be shown that 
W is symmetric and block tridiagonal with both a discrete and continuous spectrum 
of eigenvalues A M and a corresponding complete orthonormal set of eigenvectors ip^, 
indexed by /j, [24]. We thus see that the modes (3.49), which are labeled by /i, the 
transverse wave vector q, and the direction of decay, are of the form 

§ ) = E E A=r^LD l mm'{^ 0, 0)Y lm ,(s)e^^> , (3.59) 



l,m m 1 



where 

QM = + 1/A= • (3.60) 

The Green's function for the RTE in the z > half-space may be constructed as a 
superposition of the above modes: 

G(r,s;r',s') = J j^y 2 ^A^^r, s)lT q ,(r', -s) , (3.61) 
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where the upper sign is chosen if z > z', the lower sign is chosen if z < z' and the 
coefficients A qil are found from the boundary conditions. Using this result, we see that 
G can be written as the plane-wave decomposition 

G(r, s; r', §') = / 7^ E ^.{z^q^^Y^)^^) > (3-62) 

J ^ > lm,l'm' 



where 



Zf (*, *\ 9) = "4=? E E A *ML> (3-63) 



st 

'° lG l n M,M' 



x D l mM (ip, 9, 0)Di, M ,(ip, 9, 0) e -^*-*'\ 
^Bfafarie-Q*™-^ , (3.64) 



2lm 



which defines B^,. It is important to note that the dependence of G on the coordinates 
r, r' and directions s, s' is explicit and that the expansion is computable for any 
rotationally invariant phase function. 

3.5. Diffuse light 

The diffusion approximation (DA) to the RTE is widely used in applications. It is valid 
in the regime where the scattering length l s — l//x s is small compared to the distance of 
propagation. The standard approach to the DA is through the P/v approximation, 
in which the angular dependence of the specific intensity is expanded in spherical 
harmonics. The DA is obtained if the expansion is truncated at first order. The DA 
may also be derived using asymptotic methods [28]. The advantage of this approach is 
that it leads to error estimates and treats the problem of boundary conditions for the 
resulting diffusion equation in a natural way. 

The DA holds when the scattering coefficient is large, the absorption coefficient 
is small, the point of observation is far from the boundary of the medium and the 
time-scale is sufficiently long. Accordingly, we perform the rescaling 

H a -> en a , n s -> -Us , t -> t/e , (3.65) 
where e < 1. Thus the RTE (3.10) becomes 



~cdi 



+ es • V/ + e 2 fi a I + n s I = iJL s J p(s, s')I(r, s')ds' . 
We then introduce the asymptotic expansion for the specific intensity 



(3.66) 



I(r, s) = I (r, s) + eh(r, s) + e 2 / 2 (r, s) + • 



(3.67) 
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which we substitute into (3.66). Upon collecting terms of 0(1), 0(e) and 0(e 2 ) we have 
p(s, s')I (r, s')ds' = I (r, s) , (3.68) 

V/ + = ^ J p(§, s')/i(r, s')ds' , (3.69) 

+ s • V/i + At a / + A* sh =iis p(§, s')/ 2 (r, s')ds' . (3.70) 



s 

1% 

c ft 

The normalization condition (3.12) forces 1$ to depend only upon the spatial coordinate 
r. If the phase function p(s, §') depends only upon the quantity s • §', it can be seen that 

/ 1 (r,s) = - r ^-s-V/ (r) , (3.71) 

where the anisotropy g is given by 

g = J V s'p(s • s')ds' , (3.72) 

with — 1 < g < 1. Note that g — corresponds to isotropic scattering and g> = 1 to 
extreme forward scattering. If we insert the above expression for I x into (3.70) and 
integrate over s, we obtain the diffusion equation for the energy density $: 

JU( r , t) = V • [£>(r) V$(r, t)] - c^(r) M (r, t) , (3.73) 

where Jo = c$/(47r). Here the diffusion coefficient is defined by 

D=\ct, t = 1 , (3.74) 

where £* is known as the transport mean free path. The usual expression for t obtained 
from the Pn method 

t = ^ , (3.75) 

is asymptotically equivalent to (3.74) since /i a = e 2 fi s . 

The above derivation of the DA holds in an infinite medium and at long times. In 
a bounded domain, it is necessary to account for both boundary and initial layers, since 
the boundary conditions for the diffusion equation and the RTE are not compatible [28]. 
For the remainder of this paper, we will make use of the Robin boundary condition for 
a bounded domain Q, which is of the form 

:= c$ + 2(D v ■ V$ = J~ on dQ , (3.76) 

where v is the outward unit normal and the extrapolation length £ ext = 2(D/c can be 
computed from radiative transport theory [14]. We note that ( = corresponds to an 
absorbing boundary and ( — > oo to a reflecting boundary. In the case of a boundary 
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between diffusing media of differing refractive index n and n\ then ( and £ ext take the 
form 

C=£i = (3-") 
where i? is the unpolarized Fresnel reflection coefficient. We also define the outgoing 
current 

:= J + = —D v ■ V$ = — (c$ - J~) (3.78) 

If the phase function is not rotationally invariant, then the diffusion coefficient 
in (3.73) becomes a second rank tensor, whose elements are given by angular moments 
of p(s, s'). The diffusion process is then anisotropic. We return to anisotropic diffusion 
briefly in section 9.1. We also note that if the speed of light is not constant, then the 
above derivation of the DA can be modified beginning from (3.28) [18]. The resulting 
diffusion equation depends explicitly upon the index of refraction n and is of the form 



!*(r,t) = V 



D(r) V (>/ 2 (r)<I»ir./)) 



Cf i a (r)$(r,t) . (3.79) 



n 2 (r) 

We now consider the scattering theory of time-harmonic diffuse waves. Assuming 
an e tuJt time-dependence with modulation frequency cu, the energy density obeys the 
equation 

-V • [£>(r)V$(r)] + (cfi a (r) + iw)$(r) = , (3.80) 

In addition to (3.80), the energy density must satisfy the boundary condition (3.76). 
The solution to (3.80) obeys the Lippmann-Schwinger equation 

$ = - GVQ (3.81) 

where $j is the energy density of the incident diffuse wave and G is the Green's function 
for a homogeneous medium with absorption Jx a and diffusion coefficient D . We have 
also made use of the specific form of the generalised potential operator introduced in 
(2.18), which for the diffusion equation is given by 

V = c5fx a - V • (SDV) , (3.82) 

where 5/i a = /i a — p, a and SD = D — D . The unperturbed Green's function G(r, r') 
satisfies 

(y 2 -k 2 )G{r,r') = -— 5(r-r>) , (3.83) 

where the diffuse wave number k is given by 

= . (3.84) 

Do 

We note here that the fundamental solution to the diffusion equation is given by 

1 p -k\r-r'\ 

G(r, r') = -r—-, r . (3.85) 

v ' ; AnD \r -r'\ y J 

By iterating (3.81) beginning with $ = $j, we obtain (compare to (2.31)) 

$ = $. _ GV$i + GVGVQi + • • • , (3.86) 

which is the analog of the Born series for diffuse waves. 
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4. Optical Tomography Modalities 

We may now define some of the different problems of optical tomography 
4-1- Optical Tomography Based on the Diffusion Equation 

In diffuse optical tomography the space X consists of pairs of functions x = (/x a , D) and 
the governing PDE C is the diffusion equation. We consider the domain as a simply 
connected set in R" 

E = n, dE = dn (4.1) 

The source functions J~ and their resultant outgoing currents J + belong to the space 
H^ 1 / 2 (dVL). For the time- dependent case the PDE C is given by (3.73) and in the 
frequency domain case by (3.80). In both cases $ nust satisfy the Robin boundary 
condition B~ given by (3.76) and Neumann boundary condition B + given by (3.78), 
therefore the mapping 

A : H- 1/2 (dn) -> H- 1/2 (dn) (4.2) 
is the Robin-to-Neumann map 

4-1.1. Diffuse Optical Tomography (DOT) The non-linear inverse problem in DOT is 
to recover functions /i a G X^,D 6 X fl from measurements y G Y. The data may be 
collected either in the time domain or in the frequency domain either at one or several 
modulation frequencies u [3]. If the former is used, it is typical to Fourier Transform 
the data and to develop the analysis in the frequcny domain. At frequency u = ('DO 
measurements), the data simply represents a total photon count without any phase 
information, and the recovery of both /x a and D suffers from non-uniqueness [29] (but 
see [30]). In this case a simpler problem is commonly defined in which one parameter 
(usually scattering) is assumed known and the forward mapping is restricted to recovery 
of the second function. 

For the DOT problem the Frechet derivative (2.12) has a kernel based on the density 
functions for absorption and diffusion 



[ Pus^u) 


)"( 


\ Po{r-u) f 




( P^(r,t) \ 


"( 


\ Po(r,t) i 





$*(r; cu)$(r; to) 
V$*(r;cu) • V$(r; u 



(4.3) 



f^*(r,-t'Mr,t-t>)dt> \ 
J 'vr(r,-0 • V$(r,t-t')dt' ! 



where we note that the adjoint problem leads to a backward-time equation. 

Although DOT is a non-linear inverse problem, a linearised version is often 
considered, which we refer to as Difference Diffuse Optical Tomography (DDOT). Here 
we take differences in measurements y s = y2 — yi and formulate a linear mapping 
as in (2.24). Clearly the linear mapping A is given by the linear Frechet derivative 
operator, evaluated at x\ = (fj, ai ,Di). As in nonlinear DOT, the simplified case of 
absorption only imaging is frequently considered. The success of this approach depends 
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on how sensitive the reconstruction is to the correct choice of linearisation point x\. 
This typically involves careful calibration and consideration of the modelling errors. We 
return to this point in section 7.4. 

4-1-2. Fluorescence Diffuse Optical Tomography (FDOT) In fluorescence optical 
tomography sources are introduced at an excitation wavelength A e giving rise to an 
excitation field $ e . Fluorescence is regarded as a function h(r) which governs the 
absorption of radiation at wavelength A e and (partial) re-emission as a source at the 
longer wavelength (lower energy) Measurements are taken at A^ and A e . Using the 
frequency domain formulation, we consider two coupled PDEs 

C(x e )<S> e = - V ■ J D e (r)V$ e (r;cj) + (c^(r) + iw)$ e (r;cj) = (4.5) 

C(x f )$ e ^ f = - V • D f (r)V$ e ^ f (r; u) + (cfi{(r) + iu) $ e ^(r; u) = 

h(r-u)<$> e (r-uj) (4.6) 

= c <r(r m ; u ) + 2(D e (r m )D ■ V$ e (r m ; u) = J e -(r s ; u) (4.7) 

=c^ f (r m ;uj) + 2CD f (r m )i)-V^ f (r m ;uj) = (4.8) 

B+& = r+(r m , u) = -D e (r m )u ■ V$ e (r m ; u) (4.9) 

B+^f = J e ^+(r m ; u) = -D f (r m )u ■ V<£ e ^(r m ; u) (4.10) 

The quantity h is a product of the concentration of fluorescent material and its 
conversion efficiency Since re-emission is a Poisson process it is characterised by a 
lifetime r. In frequency domain this leads to a complex valued parameter 

h ^ = ^rrt(rj < 4 - u > 

The simplest problem considered is to recover h G from measurements y G Y e_> ^ 
and the linear operator 

T e C S ■ X ft -> Y e ^ (4.12) 

Consider the frequency domain version, the kernel of the Frechet derivative (2.12) in 
this case is the density function with respect to h given by 

p h (r-u) = ^*{r-u J )<$> e (r-uj) (4.13) 

giving a complex valued reconstruction from which ho, r can be recovered using (4.11). 
Clearly, for DC measurements, only the fluorescence h can be recovered, not the 
lifetime; Fluorescence Lifetime Imaging Tomography (FLIM tomography) is a term used 
to emphasise that both h and r are being recovered. Time domain measurements may 
of course be used to solve this problem. If the data are Fourier transformed into the 
frequency domain the techniques for image reconstruction are unchanged. It may also 
be expressed directly in the time domain. In this case the kernel is given by a double 
convolution 

p h (r;t) = f f $ f *(r;-t")$ e (r;t-t')h(r,t" -t')dt'dt" (4.14) 
Jo Jo 
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where 

h(r, t" - t') = /i (r)e~^r (4.15) 

gives explicitly the exponential decay of the Poisson process. In practice the half-life 
does not have a discrete value but a distribution, and the linear integral equation with 
(4.14) as its kernel has to also be convolved with the finite time spread function of the 
measurement system [31]. 

4-1.3. Diffuse Optical Tomography with Fluorescence (DOT-FDOT) The discussion of 
the linear FDOT problem in section 4.1.2 is predictated on knowing the parameters 
x e and used to construct the kernel in (4.13). These could themselves be recovered 
using uncoupled DOT reconstructions at the excitation and fluorescence wavelengths. 
Therefore it is natural to consider the joint problem (using the notation of (4.5)-(4.10)) 



£(x e )$ e = 
C(x f )¥ = 



2T$ e = J e ~ B + $ e = J e+ 
= jf- = jf+ 



with the joint forward operator 

\ 



( ?l 

H 



\ 



/ 



(4.16) 
(4.17) 
(4.18) 



(4.19) 



The solution of the joint problem is computationally more demanding than the three 
seperate problems. This problem has hardly been addressed in the frequency domain 
in [32] and in the time domain in 



4-1.4- Multispectral Diffuse Optical Tomography Whereas each of the above problems 
could be considered at a range of spectral samples and the spectral variation of 
the recovered solutions determined, the idea in multispectral DOT is to reformulate 
the problem into the recovery of a set of images of known chromophores with well 
characterised spectral dependence : 



= ^ q(Aj)q -> // a (A) = ec 



(4.20) 



where e is a known matrix. Similarly a spectral dependence of scattering can be written 

as 

/4(A) = a\- b (4.21) 

This leads to a formulation 

HcaM ■ X M} - Y A (4.22) 
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/ ?AAi) \ 

V y s M J 



/A Ma (A0 ... A^(A0 ... \ 
V ... A^(X N ) ... A,,(\ N ) J 



( /xi(Ai) \ 



V ^(Aat) / 
(4.23) 

The spectral component at each pixel is independent so we introduce the diagonal 
matrices 



C fc (A,) := c k (\j)\ B a (\j) := \j b \ B b (X n ) := a\j b \n Xj\ 

and we can now represent (4.23) as 
/ y 5 (Ai) \ 



(4.24) 



( A Ma (Ai) ... A M ,(A X ) 



V o 

/ Ci(Ai) . 
Ci(A n ) . 

Ci(Ajv) . 




V . 



... A^n) 
Cjc(Ai) 

c K (x n ) 



C K (X 
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A^(Aat) / 
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x 



B (Ax) B b (Ax) 
B a (A n ) B 6 (A n ) 
B a (X N ) B 6 (Ax) / 



V b 5 / 



(4.25) 



See [34,35] 

Similarly to MSDOT in Multispectral fluorescence DOT (MSFDOT) the 
fluorescence may be considered a linear combination of fluorophores emitting radiation 
in a known spectral pattern 



M A j) =^2e i {X j )p i -> /i(A) = e / p 



In this case a linear forward operator may be constructed 



(4.26) 



(4.27) 



See [36]. Finally, a very general problem could be considered in the recovery of all 
chromophores and flurophores from the operator 
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x r ) ( y-- ) (428 > 

4-1.5. Bioluminescence (BDOT) Bioluminescence optical tomography detects the light 
emitted from within tissues through the action of luciferase on its substrate, luciferin. 
The luciferase emission spectrum is quite wide [37] (500-800 nm) and overlaps with so- 
called near infrared window of biological tissues (650-850 nm), where the light scattering 
dominates over the light absorption. Red and near infrared components of the emission 
spectrum penetrate biological tissue appreciably well allowing to image bioluminescence 
objects imbedded deeply inside tissue. 

In terms of the general framework established in section 2 the bioluminescence 
tomography forward problem is posed as 

C(x)$ = q (4.29) 
ZT$ = (4.30) 
J+ = (4.31) 

There are no incoming radiation sources and spontaneously created light is subject to 
the same boundary conditions as in all other problems considered in this paper. In the 
inverse problem, the unknown function q is to be reconstructed from measured values 
of function J + on the surface. This is a linear inverse problem which can be formulated 
in terms of a Fredholm integral equation of the first kind 

J + (U) = j[ G(6n,0?(0d& (4-32) 

In [38] the bioluminescent source distribution was recovered from a monochromatic 
data set and theoretical acpects of uniqueness of solutions were discussed in [39,40]. 
The idea of using multiple wavelengths in bioluminescence imaging was introduced by 
Chaudhari et al [41] in which this approach is applied for 3D localization of deep sources 
within a mouse phantomm vivo. Combination of bioluminescence and Positron Emission 
Tomography (PET) is discussed in [42]. 

4-2. Optical Tomography based on the Radiative Transfer Equation 

Each of the problems in section 4.1 could be considered based on the RTE instead of 
the diffusion equation. The parameters considered in the inverse problem are the same 
but the measurements are different. There are two possibilities : angularly resolved 
measurements or angularly independent measurements. In the latter case the size of the 
measured dataset is the same as in the diffusion problems, but the construction of the 
linearisation is different. In the case of angularly dependent data the data set is much 
bigger. Issues regarding uniqueness depend critically on which measurements are made. 

In the RTE-based problems, the generalised domain becomes S = Q x S 71 ^ 1 with 
boundary d~ = B_ U B + where B± = dQ x S± _1 (i/)- The 

governing PDE C is given 
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by (3.10) with parameters // a ,A* s an d the boundary condition B~ specifies the specific 
intensity on 5_ as given by (3.11). The transfer function (2.5) is known as the Albedo 
Operator 

A:L 1 (B_)^L 1 (B + ) (4.33) 
which maps directional incoming radiation on dfl to directional outgoing radiation 

J £i'( r m, s)\s-u>o = ^J~i>(r s , s)\ s . o<0 (4.34) 

where the set of source functions 8 is indexed by position index % and direction index %' . 

As in the diffusion case, the specific problems may be time-dependent, time- 
independent, or frequency domain. The role of the aperture function is crucial. In 
angular resolved meaurements 

Vj,j',i,i' = ( w 3,fi J i') d ^ 

= / w f (s) / w j (r m )J+ i ,(r m ,s)dr m ds (4.35) 
Js"- 1 Jan 

The sensitivity functions are given by 

p Ma (r)= f I*(r,s)I(r,s)ds (4.36) 

p Ms (r) = / I*(r,s)I(r,s)ds- / p(s ■ s')I*(r, a) ds'I(r, s) ds (4.37) 

where I*(r,s) is the solution to the adjoint transport equation with source boundary 
condition Wj/(s)wj(r m ) on B + 

Note that the equivalent right-hand side for the fluorescence problem in (4.6) is 
assumed isotropic. I.e. 

qf(r) = h{r) [ I e (r,s)ds (4.38) 

For detailed discussion of the uniqueness and stability of the different problems 
in inverse transport arising from angularly resolved vs angularly averaged and time- 
dependent vs time- independent measurements we refer to the recent review article [22]. 

5. Forward Modelling Methods 

The inversion methods described in the sequel are dependent on the accuracy and 
efficiency with which Green's functions can be computed. The analytic form of such 
functions for the DA are known for several problems and geometries. The format for the 
RTE is generally limited to one-dimensional and other special cases (see section 3.4). 

5.1. Volume Discretisation Methods 

In the Finite Element Method (FEM) the volume Vt is discretised to a mesh 

fi^{T n ,Nn,Un} (5.1) 
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where Tq is the set of P elements A e ; e = 1, . . ., is the set of N vertices N p ;p = 
1, . . . , N, and is the set of N locally supported basis functions u p (r);p — 1, . . . , N. 
For the diffusion equation, writing 

$(r) ~ $ ft (r) = % u P ( r ) ( 5 - 2 ) 

v 

results to a discrete system 

K(x)# = q (5.3) 

where K(x) has matrix elements 

K im = / (D(r)Vui(r) ■ Vuj(r) + (c// a (r) + iu;) ^(r)u m (r*)) dr + 

— / w z (r)w m (r)dr (5.4) 

and q represents the discretisation of the boundary conditions. Making use of the basis 
expansion (2.22) allows the representation of K(x) as 

K = S + iuB + i D ^k + c^kK) (5.5) 

k 

where B is the mass matrix, S is the matrix of surface integrals with elements given by 
the last term on the right in (5.5) and = J^-, = are given by 

Kim= [ b k (r)V Ul (r).Vu m (r)dr (5.6) 

Km= \ bk(r) Ul (r)u m (r)dr (5.7) 
Jn 

In RTE FEM, a basis is also defined for the angular directions 

_^ V 5 „_ 1 = {T 5 „_! , Ngn-l , Ugn-l } (5.8) 

Different schemes result by choosing different basis on the unit circle S 1 or sphere S 2 

(i) The discrete Ordinate method chooses a discrete set: U 5 u = {5(s — s p } which 
are chosen to give exact quadrature points for a spherical harmonic expansion of 
the angular variable. The radiance is thus represented explicitly in a set of ray 
directions [43]. 

(ii) The Pn method chooses the spherical harmonics (rotated into real functions) 
directly: Usn = |y^,(s)|. This allows explicitly the representation as diffusion 
in the lowest order with higher terms representing higher order effects [44] 

(iii) The local basis function method chooses a set of locally supported basis functions 
on Tgn-i . This allows essentially the same machinary of sparse matrix manipulation 
as for the spatial variables [45-47]. 

(iv) The wavelet basis chooses a heirarchical set of wavelets on the sphere to allow for 
variable and adaptive angular discrisation 
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The RTE has been utilized as the forward model for optical tomography 
reconstruction in a few studies. In most of these papers, the forward solution of 
the RTE has been based either on the finite difference method (FDM) or the finite 
volume method (FVM) and the discrete ordinate formulation of the RTE, see [49-53]. 
In the FEM solution of the RTE in low-scattering or non-scattering regions, the ray- 
effect may become visible [54,55]. In [56], the FEM model of [46,47] was augmented 
with the streamline diffusion modification to stabilize the forward solution in this 
case. The streamline diffusion modification has been found to stabilise numerical 
solutions of the RTE in situations in which standard techniques produce oscillating 
results [45,56,57]. Due to the heavy computational and memory requirements of meshes, 
adaptive techniques have been used within an FVM approach [58,59], as well as the FEM 
scheme [60-64]. In order to overcome the difficulty of 3D meshing, meshless methods 
can be used [65]. 

5.2. Boundary Discretisation Methods 

Rather than meshing a volume we may consider the domain as the union of a number 
of sub domains 

Sl = \JtSle, £=1...L (5.9) 

together with constant with subdomain optical parameters {x e } and a set of interfaces 
between domains 

x j = dn t ,t>, j = i...J (5.io) 

We consider the diffusion approximation and introduce the following notation 

t/^^-ik^k, (5.11) 



dfi-i 



dv 



. (5.12) 



For simplicity we develop the discussion using only two regions. For a more general im- 
plementation we refer to [66]. For homogeneous region Qi,Q 2 , Green's second theorem 
provides the following 

- I (^g^W.) - Mr'j) d<, = Q lW (5.13) 

*>M + / s Ps^W-J " ^V->) dri = QAr) (5.14) 



where 



Qt(r)= [ G e (r,r')q e (r')dr' (5.15) 

and the fundamental solutions Gi are the three-dimensional Green's functions of the 
diffusion equation in an infinite medium as in (3.85). Taking the limit as r G fli — > dfl, 
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r G Q\ — > E, and r G H 2 -> H results in three coupled boundary integral equations 
(BIEs) in the three unknown functions 



{f} = {f/ 1 ,f/ 2 ,J 2 }. 
In contrast to (5.1) only the interfaces Sj are discretised to meshes 



(5.16) 



(5.17) 



where Tj is the set of P« surface elements Aj >e ; e = l,...,Pj, Nj is the set of iVj 
vertices Ni, p ;p = 1, . . . , iVj, and Uj is the set of iVj locally supported basis functions 
u ijP (r);p = 1, . . . , JVj. Then the functions (5.11)-(5.12) are represented as 



Ni Ni 

u i( r ) ~ X] U hP u hp( r )> J i( r ) ~ 5^ Ji,pUi,p(r) . 

p=l p=l 



(5.18) 



Using the representation (5.18) in the BIE system (5.13)-(5.14) followed by sampling 
at the nodal points, we obtain a discrete system for the collocation Boundary Element 
Method [67-71] as a linear matrix equation 



(5.19) 



where we assumed that the source term is only in Qi, and where Ui, J{ are the vectors 
of coefficients Ui tP , J^ p in (5.18), and the matrix elements are given by 







-A 1 


B 12 \ 


/ 








/ 




A 2 4- J-R 2 




l-A? 2 














QiIe 


V 




l + A^ 2 


~ B 22 / 


V 


J 2 






V 


/ 



A? lx {p,it) 
A{ 2 (p,p') 
Ai 2 (p,p') 

bL(p,p') 

B{ 2 (p,p f ) 
B J 22 (p,p') 



dn dv 
dv 
dv 

Gi(N hPl r' m ) 

D l 
D 2 



u 2 ,p'(r'J dr' m 
U2, P '(r' m ) dr' m 
ui, P '(r' m ) dr' m 



U2, P '(r'J dr' m 
U2,p'(r'J dr' m 



The extra function £f in (5.19), arises due to singularities on the boundary. These 
terms can be calculated by surrounding the point r m , which lays on the boundary, 
by a small hemisphere cr £ of radius e and taking the limit when e — > 0. However, as 
shown in [68,72], this term can be obtained indirectly by utilising some simple physical 
considerations. In particular, we have ^ + ( r ) = C r ( r ) = \ w hen the observation point 
lies on a smooth surface, which is the case considered here. 
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Note that appropriate numerical techniques for handling the singularity of the 
kernels is required for the diagonal elements of these matrices [66]. We represent (5.19) 

as 

Jf = Q, (5.20) 

where / = [Ui, U2, V2] is the discrete version of {f} in (5.16). The matrix T takes the 
form of a dense un-symmetric block matrix. 

The normal flux is not always required. In this case we may take the Schur 
complement of (5.19) to obtain 

/ \\ + Ah + £Bh -A} 2 + B\ 2 (B^)- 1 (I| + Al 2 ) \f U, \ ( Q x \ m \ 
\ k\, + ^ II - k\ 2 + B? 2 (B^)' 1 (I| + A y )[U 2 ) \Q 1 \v ) 

(5.21) 

This is a smaller system than (5.19) and therefore computationally cheaper. However, 
it is still usually lengthy to solve large dense matrix systems. Another approximation is 
obtained if write (5.21) in the form 

i(l-2G)£/ = Q, (5.22) 

We can then use a Neumann series approximation 

\u ~ (l + 2G + 4G 2 + ...)Q (5.23) 

truncation of this series at the n th power in G constitutes the Kirchoff approximation 
[73,74]. 

5.3. Monte Carlo Methods 

Monte Carlo methods are highly prevelant in many models of radiation in tissue. Within 
optical tomography, one of the early cited works include Prahl et al [75] dating from 1989 
and Wang et al [76] describing multi-layered tissues. For the more complex geometries 
involved in modelling light propogation in the head, [77, 78] describe 3D voxel based 
MC models including anisotropy [79]. 

Use of the MC model within image reconstruction is usually limited to the 
construction of a linear perturbation model for reconstructing the difference in optical 
properties from changes in the data [78,80] with [81] describing the use of MC to recover 
optical absoprtion changes in a layered model 

Because of the heavy computational cost of Monte Carlo methids, GPU based fast 
MC was investigated in [82]. Another way to 'parallelize' computation of MC solutions 
with different optical parameters (/i' s and // a ) is to use white Monte Carlo. Absorption 
can always be scaled, but the problem of photon termination remains (if absorption is 
not known when photon packets are followed). In a homogeneous semi-infininite model 
also scattering can be scaled, but this is not applicable to heterogeneous models [83]. 
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5.4- Hybrid Methods 

5.4-1- Coupled Radiative Transport and Diffusion To overcome the limitations of 
diffusion theory in the proximity of the light sources, hybrid methods which combine 
Monte Carlo simulation with diffusion theory have been reported [84,85] and have been 
applied to media with low-scattering and non-scattering regions [86] . This approach still 
suffers from the time consuming nature of the Monte Carlo methods and often require 
iterative mapping between the models which increases computation times even more. 

The Fokker-Planck equation, which has been found to describe light propagation 
accurately when the scattering is strongly peaked in the forward direction, can be utilized 
at the small depths below highly collimated light sources [87]. However, it does not 
describe light propagation accurately at greater depths in biological tissues nor within 
low-scattering or non-scattering regions. 

A coupled radiative transfer equation and diffusion approximation model was used 
in [88] for media which contained strongly absorbing and low-scattering regions and 
in [46,47] to describe light propagation in turbid medium containing highly collimated 
light sources. In [56] this model was extended to describe light propagation in turbid 
medium with low-scattering and non-scattering regions. In the coupled approach, the 
RTE is used as a forward model in sub-domains in which the assumptions of the DA 
are not valid. This includes the regions in the proximity of the source and boundary 
and the low-scattering and non-scattering regions. The DA is used as a forward model 
elsewhere in the domain. The RTE and DA are coupled through boundary conditions 
between the RTE and DA sub-domains and they are solved simultaneously using the 
FEM. 

5-4-2. The Void Problem The void problem refers to the difficulty of modelling a non- 
scattering region within a highly scattering one. This has particular relevence in optical 
tomography applied to the brain where the presence of the cerebral spinal fluid (CSF) 
has just such a property [89]. 

The effect of void regions on particle transport was identified in early studies 
of the Boltzmann equation for the propagation of neutrons in [54, 55, 90, 91]. In 
applications of optical tomography the effect of the voids has been studied more 
recently in [50,56,88,92-96], and it has been found that under these circumstances, 
most numerical solutions to the transport equation, apart from Monte Carlo, run into 
difficulties so that special methods need to be developed. One proposed solution is the 
radio sity- diffusion model [92,94,97]. The principle here is to model propagation in voids 
through geometrical optics, assuming that the distribution of directions is given by a 
non-local boundary condition at a diffuse/non-diffuse interface 
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$(r) + 2(Du ■ V$(r) = ^ ^ cos 9 cos 0' ^$(r') - 2D Q + ^ ) ) * ' V$ ( r ' 



) x 



(l-Wtf)| 3 Wr,rO BIPH ^ i ;JJ r " r ' ll dr' (5.24) 
r, r e <9f2 cos 9 = 0(r) ■ -, cos 9' = 0(r') ■ - 



where dQ is the surface of the void region and h(r, r') is a visibility flag that is unity 
if r, r' are in line of sight across the void, and zero otherwise. See [98] for a detailed 
derivation. Both a FEM-radiosity [94] and BEM-radiosity [99] implementation have 
been presented. The radiosity-diffusion model was used in [100-103] where a boundary 
recovery method was also developed for the inverse problem. 

In [96, 104] a different approach was proposed in which the void was replaced by 
an interface condition with an anisotropic Laplace-Beltrami operator. The tangential 
diffusion induced by this anisotropy was related to the void width and curvature. 



6. Direct Inversion Methods 



By direct inversion we mean the use of inversion formulas and associated fast 
image reconstruction algorithms. In optical tomography, such formulas exist for 
particular experimental geometries, including those with planar, cylindrical and 
spherical boundaries. In this section, we consider several different inverse problems 
organized by length scale. We begin with the macroscopic case and proceed downward. 



6.1. Diffuse optical tomography 

As described in section 4, the aim is to reconstruct the absorption and diffusion 
coefficients of a macroscopic medium from boundary measurements. We first consider 
the linearized inverse problem in one dimension and then discuss direct methods for 
linear and nonlinear inversion in higher dimensions. 



6.1.1. One- dimensional problem We start by studying the time-dependent inverse 
problem in one dimension, which illustrates many features of the three-dimensional 
case. Let Vl be the half-line x > 0. The energy density $ obeys the one-dimensional 
version of the time- dependent diffusion equation (3.73) 

JU(x, t) = D-^$(x, t) - cfi a (x)$(x, t) , x e n , (6.1) 

where the diffusion coefficient D is assumed to be constant, an assumption that will be 
relaxed later. The energy density is taken to obey the initial and boundary conditions 

$(rc,0) =5(x-x 1 ) , (6.2) 
H0,t)-£ cxt — (0,t) = 0. (6.3) 
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Here the initial condition imposes the presence of a point source of unit-strength at X\. 
Since $ decays exponentially, we consider for k > the Laplace transform 



/•oo 

-k 2 Dt 



<f>(x,k) = / e~ k Dt $(x,t)dt , (6.4) 



lo 



which obeys the equation 



(Hb{ ' ' + k 2 (l + /,(./•))<!>(.,•) = ±-8(x -./•,) . (6.5) 



dx2 , v . , , /w ,^,- jD , 



where 77 is the spatially-varying part of the absorption, which is defined by 77 = 
cfi a /(Dk 2 ) — 1. The solution to the forward problem is given by the integral equation 

$(x) = - k 2 [ G{x,y)^{y)r 1 {y)dy , (6.6) 



n 



where the Green's function is of the form 



and $j is the incident field, which obeys (6.5) with 77 = 0. The above integral equation 
may be linearized with respect to r](x) by replacing u on the right-hand side by U{. This 
approximation is accurate when supp(r/) and rj are small. If we introduce the scattering 
data $ s = $j — $ and perform the above linearization, we obtain 

^ s (xi,x 2 ) = k 2 f G(x 1 ,y)G(y, x 2 )r ] (y)dy . (6.8) 
Jn 

Here $ s (xi,X2) is proportional to the change in intensity due to a point source at x\ 
that is measured by a detector at x 2 - 

In the backscattering geometry, the source and detector are placed at the origin 
(xi = x 2 = 0) and (6.8) becomes, upon using (6.7) and omitting overall constants 



00 

kx 



<S> s (k) = / e~ kx r](x)dx , (6.9) 
Jo 

where the dependence of $ s on k has been made explicit. Thus, the linearized inverse 
problem can be seen to correspond to inverting the Laplace transform of rj. Inversion of 
the Laplace transform is the paradigmatic exponentially ill-posed problem which can be 
analyzed as follows [105] . Eq. (6.9) defines an operator A : 77 1— > $ s which is bounded 
and self-adjoint on L 2 ([0, 00]). The singular functions / and g of A satisfy 

A*Af = a 2 f, AA*g = a 2 g, (6.10) 

where a is the corresponding singular value. In addition, / and g are related by 



Af = ag, A*g = af 



(6.11) 
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If we observe that A*A(x, y) = l/(x + y) and use the identity 

/ T— d y= • FTn ' -l<Rea <0, (6.12) 
J l + y sm(a + l)7r 



we see that 



and 



f s (x) = g* s (x) = -^=x~^ +is , seR (6.13) 
v 2ir 



a] = l ( . ~ e— '-I . (6.14) 

COSh(7TS) 

Note that the singular values of A are exponentially small, which gives rise to severe 
ill-posedness. Using the above, we can write an inversion formula for (6.7) in the form 

oik J dsR I -J f s (x)g* s (k)<t> s (k) , (6.15) 

where the regularizer R has been introduced to control the contribution of small singular 
values. 

6.1.2. Linearized inverse problem We now consider the linearized inverse problem in 
three dimensions. To indicate the parallels with the one-dimensional case, we will find it 
convenient to work in the half-space geometry. Extensions to other geometries, including 
those with planar, cylindrical and spherical boundaries is also possible [106-108]. In 
particular, we note that the slab geometry is often employed in optical mammography 
and small animal imaging. As before, we define the scattering data $ s = — $. 
Linearizing (3.81) with respect to 5/i a and 5D we find that, up to an overall constant, 
$ s obeys the integral equation 

r 2 ) = J dr[G(r 1 ,r)G(r,r 2 )c5ii a (r) + W r G(r 1 ,r)-W r G(r,r 2 )5D(r)} , (6.16) 

where r\ is the position of the source, r 2 is the position of the detector and we have 
integrated by parts to evaluate the action of the operator V. The Green's function in 
the half-space z > is given by the plane-wave decomposition 

G(r, r') = J j^y 2 9(q; z, z>) exp[iq • (p - p')\ , (6.17) 

where we have used the notation r = (p,z). If either r or r' lies in the plane z — 0, 
then 

9(9;Zi2 , ) = ^^Qfc1i > (618) 

A) Q{q)Kext + 1 

where 

Q(q) = vV + k 2 . (6.19) 

The inverse problem is to recover 5a and SD from boundary measurements. To proceed, 
we introduce the Fourier transform of $ s with respect to the source and detector 
coordinates according to 



$ s (<?i,<? 2 ) = J dpidp 2 e^ l Pl+92 ^$ s (p 1 ,0;p 2 ,0) . 



(6.20) 
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If we define 

V>(<h, q 2 ) = (Q(qi)4xt + l)(Q(<? 2 )4xt + 1)^.(91, q 2 ) (6.21) 
and make use of (6.17) and (6.16), we find that 

MQi, Q2) = J dre^ + ^e q 2 , z)c5fi a (r) + K D (q u q 2 , z)5D{r)\ . (6.22) 

Here 

ka(Qi, Q2, z) = cexp[- (Q(qi) + Q(q 2 )) z] , (6.23) 
K D (qu 9 2 ; z) = - (qi • q 2 + Q(qi)Q(q 2 )) exp[- (Q( 9 i) + Q(g 2 )) z] . (6.24) 
We now change variables according to 

qi = q + p/2, q 2 = q-p/2, (6.25) 

where q and p are independent two-dimensional vectors and rewrite (6.22) as 

ip(q+p/2,q-p/2) = J dr exp(-iq- p) [k a (p, q; z)S^i a (r) + K D (p, q\ z)5D(r)\ , (6.26) 

The above result has the structure of a Fourier-Laplace transform by which S/j, a and SD 
are related to ip. This relation can be used to obtain an inversion formula for the integral 
equation (6.22). To proceed, we note that the Fourier-transform in the transverse 
direction can be inverted separately from the Laplace transform in the longitudinal 
direction. We thus arrive at the result 

ip(q + p/2,q -p/2) = J K A (p, q; z)5fx a (q, z) + K D (p, q; z)SD(q, z) dz , (6.27) 

where 8/j, a and SD denote the two-dimensional Fourier-transform. For fixed q, Eq. (6.27) 
defines an integral equation for 8fj, a (q,z) and SD(q,z). It is readily seen that the 
minimum L 2 norm solution to (6.27) has the form 

~5fi a (q,z) = J dpdp , K A (p,q;z)M- 1 (p,p';q)^(p' + q/2,p' - q/2) , (6.28) 

~5D(q,z) = J dpdp>K* D (p,q;z)M~ 1 (p,p>;qW(p> + q/2,p' -q/2) , (6.29) 
where the matrix elements of M are given by the integral 

M(p,p';q)= f [K A (p,q;z)K* A (p',q;z) + K D (p,q;z)K* D (p',q;z)]dz (6.30) 
Jo 

Finally, we apply the inverse Fourier transform in the transverse direction to arrive at 
the inversion formula 

ty«(r) = J J^y e ~ iq ' P j dpdp>K A (p,q-z)M-\p,pi-q)^q' + p/2,q> -p/2) , (6.31) 
5D(r) = J ^Le-^J dpdp > K* D (p,q;z)M- 1 (p,p';q)^(q > + p/2,q' -p/2) . (6.32) 
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Several remarks on the above result are necessary. First, implementation of (6.31) 
and (6.32) requires regularization to stabilize the computation of the inverse of the 
matrix M. Second, sampling of the data function $ s is easily incorporated. The Fourier 
transform in (6.20) is replaced by a lattice Fourier transform. If the corresponding wave 
vectors q, q2 are restricted to the first Brillouin zone of the lattice, then the inversion 
formula (6.31) and (6.32) recovers a bandlimited approximation to the coefficients S/i a 
and SD [108, 109]. Third, the resolution of reconstructed images in the transverse 
and longitudinal directions is, in general, quite different. The transverse resolution is 
controlled by sampling and is determined by the highest spatial frequency that is present 
in the data. The longitudinal resolution is much lower due to the severe ill-posedness of 
the Laplace transform inversion which is implicit in (6.26), similar to the one-dimensional 
case discussed in section 6.1.1. Finally, the inversion formula (6.31) and (6.32) can be 
used to develop a fast image reconstruction algorithm whose computational complexity 
scales as 0(N M log M), where M is the number of detectors, N is the number of 
sources and M ^> N [108]. The algorithm has recently been tested in noncontact 
optical tomography experiments. Quantitative reconstructions of complex phantoms 
with millimeter-scale features located centimeters within a highly-scattering medium 
have been reported [110, 111]. Data sets of order 10 8 source-detector pairs can be 
reconstructed in approximately 1 minute of CPU time on a 1.5 GHz computer. 



6.1.3. Boundary removal The direct inversion method described in section 6.1.2 is 
applicable to relatively simple geometries, including those with planar, cylindrical 
and spherical boundaries. An extension of the method can be employed to handle 
more complex geometries [112]. Consider a bounded domain Q in which the diffusion 
equation (3.80) for the energy density $ holds. Making use of the Kirchoff integral as 
described in section 5.2, we obtain 



$(r) = %(r) - c [ G(r,r')5ii a (r')<S>(r')dr' 
Jn 

+ D I [G(r, r')V$(r') - $(r')V r/ G(r, r')} ■ vdr' , 
Jan 



(6.33) 



where, for simplicity, we have assumed that the diffusion coefficient is constant 
throughout Q. Here 

$ (r) = / G(r,r')J-{r')dr' (6.34) 
Jq 

and G is the fundamental solution to the diffusion equation defined in (3.85). Using 
the boundary condition (3.76) and introducing the approximation $ = $ in ^, we find 
that (6.33) becomes 



$ _ $ _ M )(r) = / G(r,r')^ (r')cSfji a (r')dr' , 



where 



M(r) 



do. 



—G{r,r') + v-V r ,G{ry) 

*-ext 



&(r')dr' . 



(6.35) 



(6.36) 
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We observe that the left hand side of (6.35) is known from measurements. Thus, within 
the accuracy of the aforementioned approximation, the inverse problem becomes that 
of recovering 5a in an infinite medium with measurements on the surface dQ. Let B 
denote a ball that contains Q. Then the field on dQ can be propagated to OB by using 
the Rayleigh-Sommerfeld formula 

$(r) = D f v ■ V r ,G(r, r')$(r')dr' , redB. (6.37) 
Jan 

Thus, it is possible to reformulate the inverse problem for a bounded domain in terms of 
one for an infinite domain, with measurements prescribed on a spherical surface. This 
latter problem is one for which direct inversion can be performed. 



6.1.4- Nonlinear inverse problem We now consider the nonlinear inverse problem of 
DOT. The Born series (3.86) can be rewritten in the form 

$ a (ri,r 2 ) = J drK l 1 (r 1 ,r 2 ;r)rj i (r) + J drdr'K t 2 j (r 1 ,r 2 ;r,r')ri i (r)ri j (r') + - ■ ■ , (6.38) 



where 



summation over repeated indices is implied with i,j = 1,2. The components of the 
operators K 1 and K 2 are given by 

K{(r u r 2 ; r) = G(n, r)G(r, r 2 ) , (6.40) 

K*(n, r 2 ; r) = V r G(n, r) ■ V r G(r, r 2 ) , (6.41) 

K\\r u r 2 ; r, r 1 ) = -G(r l} r)G(r, r')G(r>, r 2 ) , (6.42) 

K\\r u r 2 ; r, r') = -G(r u r)V r ,G(r, r') ■ V r ,G(r', r 2 ) , (6.43) 

Kl\r u r 2 ; r, r') = -V r G(r l} r) ■ V r G(r, r>)G(r>, r 2 ) , (6.44) 

K?( ri ,r 2 ;r,r') = -V r G(r l} r) • V r [V r .G(r, r') • V r ,G(r', r 2 )] . (6.45) 

It will prove useful to express the Born series as a formal power series in tensor powers 
of i] of the form 

$ s = Kir] + K 2 r] ®r] + K 3 i] (8) T) <8) r) H . (6.46) 

The solution to the nonlinear inverse problem of DOT may be formulated from the 
ansatz that r\ may be expressed as a series in tensor powers of $ s of the form 

7] = K-fis + /C 2 $ s ® $ s + /C 3 $ s ® $ s ® $ s + • • • , (6.47) 

where the /Q's are operators which are to be determined [113-116]. To proceed, we 
substitute the expression (6.46) for $ s into (6.47) and equate terms with the same 
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Tj 




Figure 2. Diagrammatic representation of the inverse scattering series. 



tensor power of 77. We thus obtain the relations 

K 2 K X ®K X + K X K 2 = , 

/C3X1 <8> ATi <8> ATi + /C 2 Xi ®K 2 + K 2 K 2 ®K ± + ldK 3 = , 

j-i 



m=l 



iiH hi m =j 



which may be solved for the /C/s with the result 
fC 2 = - K,iK 2 K,i ® fCi , 



i^i + /C1X3) /d (8) /Ci (8) /Ci , 
/Ci ® • • • ® /Ci . 



(6.48) 
(6.49) 
(6.50) 

(6.51) 



(6.52) 
(6.53) 
(6.54) 



(6.55) 



Kj = - ( Km Y K 

\m=l iiH him=J 

We will refer to (6.47) as the inverse series. Here we note several of its 
properties. First, is the regularized pseudoinverse of the operator K\. The 
singular value decomposition of the operator can be computed analytically for 
particular geometries, as explained in section 3.5. Since the operator /Ci is unbounded, 
regularization of is required to control the ill-posedness of the inverse problem. 
Second, the coefficients in the inverse series have a recursive structure. The operator Kj 
is determined by the coefficients of the Born series Ki, K 2 , . . . , Kj. Third, the inverse 
scattering series can be represented in diagrammatic form as shown in Figure 2. A solid 
line corresponds to a factor of G, a wavy line to the incident field, a solid vertex (•) to 
/Ci$ s , and the effect of the final application of /Ci is to join the ends of the diagrams. 
Note that the recursive structure of the series is evident in the diagrammatic expansion 
which is shown to third order. Finally, inversion of only the linear term in the Born series 
is required to compute the inverse series to all orders. Thus an ill-posed nonlinear inverse 
problem is reduced to an ill-posed linear inverse problem plus a well-posed nonlinear 
problem, namely the computation of the higher order terms in the series. 

We now characterize the convergence of the inverse series. We restrict our attention 
to the case of a uniformly scattering medium for which 77 = 5a. To proceed, we require 
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an estimate on the L 2 norm of the operator Kj. We define the constants ji and v by 

li = sup k 2 \\G (r, 0I|l3(b„) • (6-56) 

reB a 



v = k 2 \B a \ 1/2 sup ||G (r, -)||l W . (6.57) 

reB a 

Here _B a denotes a ball of radius a which contains the support of rj. It can be shown [116] 
that if (fj, p + f p )||/Ci||2 < 1 then the operator 

Kj : L 2 (dn x • • • x dQ) — ► L 2 (B a ) (6.58) 

defined by (6.55) is bounded and 

WlCj^KCfa + vYWldWi, (6.59) 

where C = C(/i p , u p , ||/Ci|| p ) is independent of j. 

We can now state the main result on the convergence of the inverse series. 

Theorem 6.1. [116]. The inverse scattering series converges in the L 2 norm if 
1 1 AZ^x 1 1 2 < l/(A i + z/ ) and ||/Ci$ s ||i2( Ba ) < l/(/i + z/). Furthermore, the following estimate 
for the series limit fj holds 

n r/.. , _. mi*- * n ~\ N + 1 



3=1 



. c [(Vp + Vp)\\fcl®s\\L2(B a )]' 



L p (B a ) 1- (/i + Z/)||/Ci$ s || L 2 (Ba) 

where C — C(/j,, u, || /Ci || 2) does not depend on N nor on the scattering data $ s . 

The stability of the limit of the inverse series under perturbations in the scattering 
data can be analyzed as follows: 

Theorem 6.2. [116]. Let || /Ci || 2 < l/(/ i + z/ ) an d let $ sl and $ s2 be scattering data for 
which M\\]Ci || 2 < l/(fj, + v), where M = max (||$ s i H2, 1 1 *T>s2 1 1 2 ) - Let r\\ and 772 denote the 
corresponding limits of the inverse scattering series. Then the following estimate holds 

\\Vl - mW^iBa) < C\\$ sl - < $>s2\\L2{dnxdQ.) , 

where C — C(fx, u, ||/Ci||2, M) is a constant that is otherwise independent o/$ s i and <3> S 2- 

It is a consequence of the proof of Theorem 6.2 that C is proportional to ||/Ci||2- 
Since regularization sets the scale of ||/Ci||2, it follows that the stability of the nonlinear 
inverse problem is controlled by the stability of the linear inverse problem. 

The limit of the inverse scattering series does not, in general, coincide with 77. We 
characterize the approximation error as follows. 

Theorem 6.3. [116] Suppose that ||/Ci|| 2 < l/{fi + u), ||/Ci$ s || L 2 (jBa ) < l/(/i + z/). Let 
M. = max (|M|L2( Ba ), ll/Ci-Ki^Hx^^)) and assume that M. < l/([/, + v). Then the norm 
of the difference between the partial sum of the inverse series and the true absorption 
obeys the estimate 



V 

3 



z,[(»P + Vp) II Kl II 2 11$ 



where C = C(/i, u, ||/Ci|| 2 , M) and C = C(/i, ^||/Ci||2) are independent of N and $ s . 
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We note that, as expected, the above result shows that regularization of /Ci 
creates an error in the reconstruction of rj. For a fixed regularization, the relation 
JC\Ki = I holds on a subspace of L 2 (B a ) which, in practice, is finite dimensional. By 
regularizing /C x more weakly, the subspace becomes larger, eventually approaching all 
of L 2 . However, in this instance, the estimate in Theorem 6.3 would not hold since 
||/Ci|| 2 is so large that the inverse scattering series would not converge. Nevertheless, 
Theorem 6.3 does describe what can be reconstructed exactly, namely those r] for which 
K\K\T) = I. That is, if we know apriori that rj belongs to a particular finite-dimensional 
subspace of L 2 , we can choose JC± to be a true inverse on this subspace. Then, if ||/Ci|| 2 
and ||/Ci$ s ||z,2 are sufficiently small, the inverse series will recover r] exactly. 

It is straightforward to compute the constants \i and v in dimension three. We have 

^ kh -^(^My\ (6 . 60) 

-2kdist(dQ,B a ) 

^^'^'' Vdist^r (6 ' 61) 

Note that v is exponentially small. It can be seen that the radius of convergence of the 
inverse series R — l/(fi + v) < 0(l/(ka) 3 ^ 2 ) when ka 3> 1. 

Numerical studies of the inverse scattering series have been performed. For 
inhomogeneities with radial symmetry, exact solutions to the forward problem were 
used as scattering data and reconstructions were computed to fifth order in the inverse 
series. It was found that the series appears to converge quite rapidly for low contrast 
objects. As the contrast is increased, the higher order terms systematically improve the 
reconstructions until, at sufficiently large contrast, the series diverges. 



6.2. Inverse transport 

We now turn our attention to the inverse problem for the RTE. This is a very large 
subject in its own right. Indeed, a recent topical review has already covered the 
key mathematical issues regarding existence, uniqueness and stability of the inverse 
transport problem [22]. Here we aim to discuss the inverse problem in some special 
cases which lead to direct inversion procedures and fast algorithms. See [117] for further 
details. As in section 6.1.2, we will work in the z > half-space with the source and 
detector located on the z = plane. The source is assumed to be pointlike and oriented 
in the inward normal direction. The light exiting the medium is further assumed to 
pass through a normally-oriented angularly-selective aperture which collects all photons 
with intensity 

J(r) = / u -sA(s)I(r,s)ds , (6.62) 

Ji>s>0 

where A accounts for the effect of the aperture and the integration is carried out over 
all outgoing directions. When the aperture selects only photons traveling in the normal 
direction, then A(s) = 5(s — v) and J(r) = 7(r, if). This case is relevant to noncontact 
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measurements in which the lens is focused at infinity. The case of complete angularly- 
averaged data corresponds to A(s) = 1. 

We now consider the linearized inverse problem. If the medium is inhomgeneously 
absorbing, it follows from (3.47) and (6.62) that the change in intensity measured relative 
to a homogeneous reference medium with absorption p, a is proportional to the data 
function $ s (p 1 ,p 2 ) which obeys the integral equation 

$ s (pi,p 2 )=/ i>.sA(s)$ a (pi,0,l;p2,0,-s)ds. (6.63) 

JOs>0 

Following the development in section 6.1.2, we consider the Fourier transform of 
with respect to the source and detector coordinates. Upon inserting the plane-wave 
decomposition for G given by (3.62) into (3.47) and carrying out the Fourier transform, 
we find that 

^(fli,*) = M mim 2 (^i>92) / drexp [i( qi + q 2 ) ■ p (6.64) 

Ml ,M2 

- (QuAqi) + Q^(Q2)z]Sii a (r) , 

where 

AW«i,<b) = E^S^ 1 '^ 1 -^ 2 '^ /. (6-65) 

7 it i j J iv-s>0 

Eq. (6.64) is a generalization of the Fourier-Laplace transform which holds for the 
diffusion approximation, as discussed in section 6.1.2. It can be seen that (6.64) reduces 
to the appropriate form in the diffuse limit, since only the smallest discrete eigenvalue 
contributes. 

The inverse problem now consists of recovering 5fi a from $ s . To proceed, we make 
use of the change of variables (6.25) and rewrite (6.64) as 

$ s (q + p/2,q-p/2) = J dzK(q,p; z)Spt a (q, z) , (6.66) 

where 5/i a (q,z) denotes the two-dimensional Fourier transform of 5/i a with respect to 
its transverse argument and 

K{q, p;z)= (q + P/2, q ~ p/2) (6.67) 

xex V [-(Q^(q + p/2) + Q^(q-p/2))z] . (6.68) 

This change of variables can be used to separately invert the transverse and longitudinal 
functional dependences of 5ji a since for fixed q, (6.66) defines a one-dimensional integral 
equation for S/i a (q, z) whose pseudoinverse solution can in principle be computed. We 
thus obtain a solution to the inverse problem in the form 

<Wr) = J ^e-*-* J dpK+(z;q,p)$ s (q + p/2,q-p/2) , (6.69) 
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where K + denotes the pseudoinverse of K. It is important to note the ill-posedness due 
to the exponential decay of the evanescent modes (3.59) for large z. Therefore, we expect 
that the resolution in the z direction will degrade with depth, but that sufficiently close 
to the surface the transverse resolution will be controlled by sampling. 

6.3. Single- scattering tomography 

Consider an experiment in which a narrow collimated beam is normally incident on 
a highly-scattering medium which has the shape of a slab. Suppose that the slab is 
sufficiently thin that the incident beam undergoes predominantly single-scattering and 
that the intensity of transmitted light is measured by an angularly-selective detector. If 
the detector is collinear with the incident beam and its aperture is set to collect photons 
traveling in the normal direction, then only unscattered photons will be measured. Now, 
if the aperture is set away from the normal direction, then the detector will not register 
any photons. However, if the detector is no longer collinear and only photons which exit 
the slab at a fixed angle are collected, then it is possible to selectively measure single- 
scattered photons. Note that the contribution of single-scattered photons is described 



The inverse problem for single-scattered light is to recover fj, a from measurements 
of 51 as given by (3.41), assuming /i s and p are known. The more general problem of 
simultaneously reconstructing (j, a , /i s and p can also be considered. In either case, what 
must be investigated is the inversion of the broken-ray Radon transform which is defined 
as follows. Let / be a sufficiently smooth function. The broken-ray Radon transform is 
defined by 



Here BR(ri, Si;r 2 , s 2 ) denotes the broken ray which begins at t*i, travels in the direction 
§i and ends at r 2 in the direction s 2 . Note that if r 1 ,r 2 ,s 1 and s 2 all lie in the same 
plane and §i and s 2 point into and out of the slab, then the point of intersection R is 
uniquely determined. Thus it will suffice to consider the inverse problem in the plane 
and to reconstruct the function / from two-dimensional slices. 

Evidently, the problem of inverting (6.70) is overdetermined. However, if the 
directions §i and s 2 are taken to be fixed, then the inverse problem is formally 
determined. To this end, we consider transmission measurements in a slab of width 
L and choose a coordinate system in which §i points in the z direction. The sources are 
chosen to be located on the line z = in the yz-plane and are taken to point in the z 
direction. The detectors are located on the line z = L and we assume that the angle 9 
between §i and s 2 is fixed. Under these conditions, it can be seen that the solution to 



by (3.41). 




(6.70) 
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(6.70) is given by 



cot (^j F(k, z) - i cot 2 (^j ke- icot ^ 2 ^ (6.71) 



x 

where 



[ Z e icot ^ 2 ^F(k,^ 
Jo 



(6.72) 



F(k,0 = lim / e i ^R b f(y 1 ,y 1 + y 2 )dy 1 (6.73) 

and Ay(£) = (£, — £) tan0. 

Eq. (6.71) is the inversion formula for the broken-ray Radon transform. We note 
that in contrast to x-ray computed tomography, it is unnecessary to collect projections 
along rays which are rotated about the sample. This considerably reduces the complexity 
of an imaging experiment. We also note that the presence of a derivative in (6.73), as in 
the Radon inversion formula, means that regularization is essential. Finally, by making 
use of measurements from multiple detector orientations it is possible to simultaneously 
reconstruct /i a and /i s . See [118] for further details. 



7. Numerical Inversion Methods 



The starting point for numerical inversion methods is the definition of a variational 
form whose minimum represents the solution. The most commonly used is a regularised 
weighted least squares form 

£{v,H*)) = \\\v-H*)\?^ + 

From the Bayesian point of view (7.1) represents the negative log of the posterior 
probability density function 

n(x\y) = exp(-£ (y, F(x))) = ir(y\x)ir(x) (7.2) 

where n(y\x) is the Gaussian (normal) probability density function of the distribution 
of the noise in the data with mean zero and covariance l~ e 

n(y\x) = n QOisc (y - r(x)) = Af(o, r e ) (7.3) 

and n(x) is the Gaussian (normal) prior probability density function of the distribution 
of x with mean zero and covariance Y x . Thus minimisation of (7.1) represents the 
maximum a posteriori (MAP) estimate of (7.2). 

The forms in (7.1) and (7.2) can be generalised in two main ways. Firstly the prior 
may be assumed not normal but of the form 



it(x) = exp(— W(x)) 



(7.4) 
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Secondly the noise need not be Gaussian distributed but could follow an alternative 
probability model such as Poisson. This consideration leads to alternative data fit 
Junctionals of which the commonest is the cross-entropy or KL divergence 



KL(y,F{x)) 



y log 



y 



-y + F{x) dy 



(7.5) 



From an optimisation point of view the minimisation of (7.1) is known as the 
regularised output least squares solution and can be achieved with classical methods if 
£(y, J-{x)) is convex. Non-convexity does not normally arise from the forward mappings 
but may arise if the prior is non-convex. An alternative formulation of the problem which 
has its origins in control theory is known as the "all-at-once" or "PDE-constrained 
method". In this formulation the solution of the forward problem is treated as a 
constraint 



minimise: 



(7.6) 



\y-MU\H-! + aV(x) 
subject to: (2.1) - (2.3) 

Several of the problems in section 4 are linear. In addition it is frequently the case 
that a linearised problem is considered even when the forward problem is non-linear. 
Suppose we assume a known linearisation point x and we consider the true solution to 
be a perturbation from this point x = xq + x 6 then we consider the minimisation 



arg min x « 



= arg imiij.j 



-|||/ - F{x G ) - T (x )x 



S\\2 



a-\\x 
2 



5\\2 



—\\v 
2 \\y 



F'(x )x 5 " 2 



+ a-\\x 5 \\* 1 
2 x 



(7.7) 
(7.8) 



where y 5 = y~J r (xo) is the change in measurement assumed to be linearly related to x s . 
Discretisation of this problem as discussed in section 2 leads to the form (2.24). In the 
next section we consider several generic linear solvers as applied in optical tomography. 



7.1. Linear Methods 

Consider the solution of a linear discrete problem 

y 5 = Ax 5 y s e R M , x 5 e R N , A G 



^ M x R N 



(7.9) 



The weighted least-squares term to be minimised is 



x^ = arg min^ 

= arg min^i 
We define factorisations 



y°-Ax s \\ 2 r - r+ai-Wx'W' 1 



S\\2 



(7.10) 



l t l = r 1 l t l = r 1 



(7.11) 
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and the canonical (dimensionless) variables 

y=l e y s A=L e AL; 1 x — L x x s (7.12) 



Then the solution to (7.10) is given by 
x* = arg min 



1 ~~ 1 
E a (y,Ax) : = Aa;|| 2 + a-||a;|| 2 



(7.13) 



In purely algebraic terms, the transformation A — > A given in (7.12) is a conditioning 
step, with L e , L" 1 the left and right preconditioners respectively [119]. 
We now consider some linear inversion methods for (7.13). 

7.1.1. Newton methods (overdetermined system) A Newton solution for an over 
determined system (M > N) is written 

x* = (A T A + al) 1 A T y (7.14) 

i x xi = (l^a^alj 1 + ai)- 1 Lj T A T r e -V 

xi = (A T r e - 1 A + arj 1 )- 1 A T r e -V (7.15) 

7.1.2. Newton methods (underdetermined) A Newton solution for an underdetermined 
system (M < N) is written 

x* =A T fAA T + «| N ) 'y (7.16) 



l x xi = L; T A T Lj(L e Ar :E A T Lj + al) 1 L e y s 

xt =r x A T (AV x A T + ar e y 1 y s (7.17) 

7.1.3. Landweber Method and Steepest Descent The steepest descent method is an 
iterative reconstruction scheme. Beginning with an arbitrary initial estimate x° (usually 
all zeros), we iterate solutions using 

x k+1 = x k + r k s k (7.18) 

where the steepest direction for (7.13) is given by 

s k = A T (y - A£fc) - ax k (7.19) 

and 

= nr^r^i „, » ( 7 - 20 ) 



is the step length in direction that minimises the one dimensional error function 
S(t) := £ a (y,A(x + rs k )) 
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The implication is that the vectors in the sequence {si, s 2 , . . .} have the properties [120] 
that they can be obtained iteratively 

s k+1 = s k -T k A T As k , (7.21) 

and satisfy an orthogonality condition 

(s k+ i,s k ) = 0. (7.22) 

The Landweber method replaces (7.18) with 

x k+ i = x k + rA T (y - Ax k ) (7.23) 



where r is a relaxation parameter. Rather than take the exact step that would move 
the solution to the minimum of the error function in the direction s k the Landweber 
method takes a fixed step. Thus its convergence is slower than the steepest descent 
method. 

Considering the form of (7.23) we have 

x S(k + l) = x 8(k) + r |- xA T|--l ( y 6 _ Ax 8(k)j (7>24) 

7.1. 4- Krylov Methods Consider the general, overdetermined quasi- Newton scheme, 

a'^H.-W, (7-25) 

or the underdetermined scheme 

X s = t^H^y 6 (7.26) 

where G R N x R N , H e G R M x R M are symmetric positive definite. Then the Krylov 
spaces may be defined as the spaces spanned by the set of vectors 

K x := span {AV, H x AV ■ ■ ■ H£AV • • •} (7.27) 
K e := span {AV, A T H e? / . . . A T H^ 5 , . . .} (7.28) 

Deriving these Krylov sequences for the problems (7.14), or (7.16) both give rise to the 
same set 

K a = |A T y,AWy + aA T y,...,^^'(A T A)^' ? A T y...| (7.29) 

Defining the unregularised Krylov space of dimension J, via the basis set 

K= {v U) } := |(A T Ay A T y, j = 0...J-lJ (7.30) 
we may derive the Krylov space for any regularisation parameter a via the basis set 

r «<°> = «(o) 

~(1) . ~(0) 

(7.31) 
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r,A T r-V 
(r^r^A + a) v 



(o) 



(7.32) 



Contrast this with the basis defined from (7.15) 



K (ON) = 



or that from (7.17) 



K (UN) 



(1) 



V, 



U) 



A T r e -V 

A T f~ 1 AA T r~ 1 y <5 + al"; w- V 



(A T Tj 1 A + af" 1 )' 7-1 A T \~~ 1 y s 



r.AV 

r x A T Ar x A V + aV x A T V e y 5 



(7.33) 



(7.34) 



In the conjugate gradient method a second sequence of A-conjugate directions 
{Pi)f>2, • • •} is constructed and (7.19) becomes 



Xk+l = X k + T k p k . 

The direction p k is given by 

pk+i = s k+1 + (3 k p k 
where the Gram-Schmidt constants f3 k are given by one of the methods 



Pk 



Fletcher-Reeves: ^~ + h 



Polak-Ribiere: p±l2^±iz£d 



(7.35) 
(7.36) 

(7.37) 



and the step length (c.f. (7.20)) is given by 



Tk 



\Ap k 



a Pfc 



(7.38) 



We note that given a Krylov vector (an image in parameter space) the next vector 
is formed by the steps 

forward project — > filter in data space — > back project — > filter in image space (7.39) 

Thus the sequence can be formed without building the matrix A and constitutes a matrix 
free approach. 
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7.1.5. Row and Column Normalisation Let us define af G M. N as the i th row of A 
and G R M as the j'th column. Then we can define a number of purely algebraic 
conditioning matrices 

Rx = diag[{||o?||i;i = l...M}] 

d = diag[{||af|| i; j = l...JV}] U ' 4Uj 

Rl = diag[{||o?||l;i = l...M}] = diag [AA T ] 

q = diag [{ I |of I |i; j = 1 . . . i\T}] = diag[A T A] ^ 

where the notation ||.|| p := (Yl .p) 1 ^ defines the p-norm of a vector. 

For the special case where A contains purely nonnegative elements the forms Ri, 
Ci can be constructed as row and columns sums: 

Ri = Ai , Ci = A T i (7.42) 

The choice 

r e «- Rx r; 1 «- d 

leads to the Simultaneous Algebraic Reconstruction Technique (SART) 

= ^(*) + rCr 1 A T Rr 1 {y s - Ax^ k) ) (7.43) 



X v y = CE 



This is suitable for emission tomography and also for fluorescence optical 
tomography where the sought for parameters are positive and the matrix A represents 
a probability of a photon emitted at voxel i arriving at detector j. 

The choice 



r; 1 - R2 r; 1 <- 1 

leads to the Simultaneous Iterative Reconstruction Technique (SIRT) 

x S(k+l) = x S(k) + rA T R -2 ( yS _ Ax 6(k)j (7>44) 

A particular preconditioner that is used in conjugate gradient solvers is given by 
(Golub and van Loan 1989) 

r; 1 = M = diag [A T r e _1 A] (7.45) 

which seeks to equalise the curvature of the objective function along the coordinate axes 
( "sphering" ) . 

The SIRT and SART algorithms derived from the Landweber method can also be 
thoguht of as derived from the Algebraic reconstruction technique (ART), or Kaczmarz 
method [121]. This is a row-action method that computes an update to the solution 
by processing one row of the linearised system at a time. The inner loop generates an 
update from row i by 

x 6(k + l) = x S(k) + w Vi k^X )_ p {7M) 

1 1 i 1 1 

where Wk is a relaxation parameter. The loop over rows of A is repeated n times. 
The rows % are processed in randomised order, which has been shown to improve the 
convergence rate of the method [122]. 
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7.2. Multiplicative Methods 
Consider the subsitution 

y = e c <-> C = log y 
An optimisation problem in terms of (7.47) may be defined 
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= arg miiij 



-|| logy - log JF^) +a ^(x) 



Taking the linearisation of this as in (7.7) leads to a problem 

— II,! |L> II — l()L> ( .In i — 
"1 



xl = arg mnijj 



logy — logjF(x ) — ^rr—r x5 \ lr-i + a^(x) 



arg mm,.. -||£ 5 - ix <5 ||^_i + a^(x) 
where ( 5 = logy — \ogj 7 (x ) and the linear operator 

with y = J-{xq). Similarly, considering 

x = e^ <-> £ = log a; 

leads to a linearised problem 
£f = arg min €<s 

= arg min^ 



-Hlogy-log^xo)- ^ 



^'^^X-i+a^x) 



with 



F'{x )x 



Vo 



Taking the discrete version we arrive at 

A = diag[l/ ? /o]L e AL; 1 diag[a ; o] 
which means that the covariances have been transformed : 

T e -> diag[y ]r e diag[?/o] , l~x -> diag[x ]r a; diag[a;o] 
The steepest descent scheme for (7.54) is given by 

y 



x k+ i = x k exp 



A log 



F(x k ) 



aV'(x k ) 



(7.47) 
(7.48) 

(7.49) 
(7.50) 

(7.51) 

(7.52) 

(7.53) 
(7.54) 

(7.55) 

(7.56) 
(7.57) 

(7.58) 
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when the mapping JF is linear (as in fluorescence DOT for example) and when 
regularisation is only taken implicitly, we arrive at the Multiplicative ART (MART) 
scheme 



x k+ i = x k exp 



A T log 



y" 



Ax i 



(7.59) 



If we start instead from the KL divergence 



KL(y,F(x)) = Y,Vlog 



y 



y + Fix) 



(7.60) 



we get to the Maximum Likelihood Expectation Maximisation (MLEM) algorithm 

y 



x k 

Xk+1 = A^i 



F(x h 



(7.61) 



In this case the explicit regularisation leads to the MAP-EM algorithm 



7.3. Non-Linear Methods 



A T i + a^'{x k ) 







y 



F(x k ) 



(7.62) 



7.3.1. Gauss-Newton Method The Gauss-Newton method can be considered as the 
iterative minimisation of the quadratic Taylor-series approximation to S(y,J r (x)) 

£{y s ,F(x k + x s )) ~ ±\\ y -F( Xk )-F'( Xk ) x s \\ 2 r _ 1+ 

a (V(x k ) + 2 (x s , Wx k ) + (x s , Wx 5 )) (7.63) 

where W(x k ) : X — > X represents the mapping induced by the linearisation of the 
functional Minimisation of (7.63) is given by the solution to 

(A T T^ 1 A + aW) x s = A T V~ 1 (y - F{x k )) - Wx k , (7.64) 
Hx s = -g (7.65) 

with Hessian H = (A T V' 1 A + aW) and gradient g = Wx k - A T r^ (y - F{x k )). From 
the form of the Hessian we can see it is guaranteed to be symmetric non-negative definite 
provided that \1/ is convex, and therefore x 6 is guaranteed to be in a descent direction 
for S. Solution of (7.65) can be carried out with any of the methods in section 7.1. 
In particular the Gauss-Newton-Krylov method uses a Krylov solver for (7.65). For a 
badly ill-posed problem the size of the requisite Krylov space can be quite small, and 
may be constructed entirely through forward and adjoint solutions and image and data 
filtering operations as in (7.39), and therefore also as a matrix-free approach. 

Since the approximation in (7.63) is only locally quadratic, the update given by 
solving (7.65) may i) not be optimal or may ii) not be a descent step (if the Hessian is 
not symmetric positive definite). There are two strategies for resolving these problems. 
The Damped Gauss-Newton method is a globalisation strategy that addresses the first 
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problem. In this approach the update direction is used in a one-dimensional line search 
to find an update step r k that minimises £ along this direction: 

r k = arg mm T £(y, T[x k + tx 5 )) (7.66) 

Note that the full non-linear mapping JF is used in (7.66), not its linearisation. The 
Levenberg-Marquardt method can also address the second problem. In this approach a 
control parameter 7 is used to modify the Hessian 

H -> H + 7I. 

When 7 is large the update x s tends towards the steepest descent direction with 
increasingly shorter steps. When 7 is small the update tends towards the Newton 
direction. The idea in the algorithm is to decrease 7 whenever the preceding step 
reduced £ but to increase it and re-solve (7.63) if the preceding step increased £ . The 
role of 7 also serves to modify the eigenspectrum of H and force it to be symmetric 
positive definite. The role of column normalisation of A (see section 7.1.5) is often 
crucial in the use of the Levenberg-Marquardt algorithm since it can "sphere" the level 
sets of the local quadratic approximation. Typically the Levenberg-Marquardt method 
is required for highly non-linear problems, which is not the case in optical tomography. 
A comparison of Levenberg-Marquardt and damped Gauss-Newton methods can be 
found in [123]. The Gauss- Newton method was used for the inverse RTE problem in 
optical tomography in [124] 

7.3.2. Nonlinear conjugate gradient method The algorithm is effectively the one in 
section 7.1.4 with line search as in (7.66) replacing the calculation of the step length 
give in (7.38). Rather than present it in the canonical variables we can present it in the 
normal variables (Fletcher-Reeves version) 
Set g = a*'(x ) - .F'Tj 1 (y - T{x G )) 

Po = -^xOo 

for k = 1, ...: do 

r k = arg rnin T £(cc fc _i + rp fc _i) 

x k = x k ^i + T k p k _i 

g k = ay(x k )-F'*r- 1 (y-F(x k )) 

a _ (9kFx9k) 
' (9k-ifxgk-i) 

Pk = -r ' x g k + p k p k -i 

end for 

Note that s k = —V x g k represents the application of the preconditioner M^ 1 = Y x 
to the gradient. If we assume ^'(x) = Wx = LjL^a; = Y^x then this conditioned 
gradient will be 

s k = Y X T'*X~ X (y - F(x k )) - ax k 

Note the similarity to the general scheme in (7.39). Also note that the variables s are 
no longer dimensionless. They have the reciprical dimensions of the original parameters 
themselves. 
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Non-linear Conjugate Gradients is usually restarted after a certain number of 
iterations, as for any descent method, may be combined with projection onto convex 
sets for enforcement of constraints such as lower and upper bounds, but at the expense 
of loss of its conjugacy properties. Nonlinear CG was used in optical tomography for 
the diffusion based problem in [125] and for the RTE based problem in [126] 

7.3.3. Limited-memory BFGS method The BFGS (Broyden-Fletcher-Goldfarb- 
Shanno) algorithm is a quasi-Newton approach that builds up estimates H _1 of the 
inverse of the Hessian matrix H [127,128]. The update rule 



x k+1 = x k - \ k Y\ k 1 g k (7.67) 

is employed, where is updated at each iteration by the formula 

H^^VjH-^ + P^, (7.68) 

p k = 1/ (z k ,d k ) , (7.69) 

V fe = I - Pk z k dJ, (7.70) 

d k = x k+1 - x k , (7.71) 

z k = g k+ i - g k (7.72) 



In the limited memory version of the algorithm (L-BFGS), the approximate matrices 
H^T 1 are not stored explicitly, but described implicitly by a limited number of pairs of 
vectors {dj, Zj}, where in each iteration, a new vector pair is added, and the oldest pair 
is discarded. 

7.3.4- Nonlinear Kaczmarz Method The non-linear Kaczmarz method is widely used 
in non-linear tomography [129]. It is applicable for problems such as DOT which use 
multiple sources. Using one source at a time, the subset of the data from this source is 
back-projected and added to the solution 

x k+ i = x k + CT^V' 1 (y i{ k) - Fi(k){x k )) (7.73) 

where the index i{k) refers to the subset of the data accessed on the k th iteration 
of the algorithm. The operator C is a simple operator playing the role of a right- 
preconditioner. A natural choice would be 

C=(F l{k) F': {k) + a\)~ l (7.74) 

but this may be hard to compute. Applications in optical tomography can be found 
in [49, 130] 

7.3.5. Iterative Coordinate Descent A relatively simple approach to generating step 
directions is to update one pixel at a time. It is equivalent to taking a descent direction 
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s, projecting to one unit axis, and iterating through the dimensions of X. This is Gibbs 
sampling procedure for the posterior distribution within the Bayesian framework. 

Writing e k for a unit with 1 in the k th entry and otherwise, the ICD method 
considers 

r k = arg min fc [£(x k + re k )] (7.75) 
x k +i = x k + r k e k (7.76) 

Taking a local linearisation around x k the one dimensional objective functional to 
be minimised is 

S(x k + re k ) = -\\y - ^(Xk)^-! - r (y - F(x k ), V^a^) + 

Vllc^H^! + a^(x k + re k ) (7.77) 
If the prior is Gaussian then (7.77) is a weighted least-squares problem with minimum 



1 p. 



but it is relatively simple also for a non-Gaussian prior since evaluation of the prior 
usually only involves neighbouring pixels. Projection onto constraint sets is also efficient 
since once the constraints are imposed for one pixel it is not revsited. However the 
evaluation of the likelihood is only efficient for an explicit matrix method and not for 
the matrix-free approach. 

The ICD method was used in PET by Fessler [131] and in Optical Tomography by 
Bouman and Webb [132,133]. Acceleration was achieved by a multiresolution strategy 
in [134]. 

7. 3. 6. PDE Constrained Method The PDE constrained method considers the approach 
presented in (7.7). To implement the method we consider Lagrangian (dual) fields Z 
and define an objective function 

Ji(x, Ui, Zi) = \\ yi - MUiWy + a*(x) + (Z h (C(x)U t - 9i )) n ( 7 - 79 ) 

where % represents an equivalent source for the boundary condition (2.2). When 
considering all sources the full Lagrangian is 

J(x, U,Z) = J2 U h Zi) (7.80) 

i 

The minimum of (7.79) occurs where the first variation 

Ji, x =a*'(x) + {Z i ,C x Ui)n (7.81) 
J itU = C*Z t - M*V~ 1 (y t - MUi) (7.82) 
J hZ = CU t - q t (7.83) 
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becomes zero. We recognise (7.82) as the equation for the backprojected field of the 
residual difference between the data and the measurement of the direct field C/j. 
We recognise C x = V(x 6 ) as the potential operator introduced in (2.18). Therefore if 
(7.83) and (7.83) are both equated to zero, (7.81) is the gradient of the negative log 
posterior of a MAP estimation scheme. If instead {— J^ x , —Jijj, —Ji,z} is used as an 
update direction for {x, U, Z} then we may solve for all variables simultaneously without 
requiring convergence for any one variable until termination. 

Taking the second variation of J results to a Newton system 



/ a$i"{x) 

c* x z, 

\ CxUi 



c x z t 

M*V- X M 
C 



CM,, \ 



C 




/ 



( 



\ 



a^'(x) + (Z l ,C x U l ) n 
C* Zi -M*V~ l {y-MUi) 
CUi-q 



(7.84) 



In many applications the system in (7.84) is simplified by taking the Schur 
complement of the complete Hessian and by using the Gauss-Newton or other quasi- 
Newton approximations for the solution scheme [52, 135, 136]. For the time-domain 
problem, even with such Hessian reduction techniques, Newton methods are infeasible 
but first-order descent schemes can still be used [33]. 

7.4- Error and Prior Modelling 

So far we did not discuss the forms of the data covariances l~ e or the prior ty(x). The 
format of noise usually be predicted on physical grounds. It is usually takem to be zero- 
mean Gaussian noise with a possibly non-white covariance. When considering photon 
counting measurements the implication is that the noise should be Poisson with variance 
cr| = i/j. Assuming sufficient signal to approach the Central Limit Theorem would lead 
to an equivalent Gaussian model of additive noise with 



T e = diag[y] 



L e = diag 



y 



1/2 



(7.85) 



A more commonly used model assumes that equal numbers of photons are collected at 
each detector leading to a constant relative error and the formal covariance structure 



diag[y 2 



diag 



(7.86) 



which also corresponds to the error model implicit in using the Rytov series in place of 
the Born series for the linearised model. Within this photon counting paradigm there 
is no correlation between errors on different detectors. However, when considering the 
actual errors between measured and modelled data the correlation is far from being 
negligible, and their mean is far from being zero. This discrepency can be understood 
by formally considering the modelling error as a random variable [137]. 



vr(y mcas -^(x truc ))~AT(e, r e + r e ) 



(7.87) 
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where Th is an approximate model with the accuracy of the computational method 
employed, e is the model bias representing the discrepency between this model and the 
real Physics, and V e is the model error covariance. In practical applications of optical 
tomography an estimate of the bias is made by measureing a reference problem and 
comparing it to a reference model 

e = y rci - F h (x rei ) (7.88) 

which leads to a corrected model 

^corrected (x) = Fh{x) + y"* ~ F h (Xrei) (7.89) 

Clearly the corrected model and the measured data agree exactly for the value x = x re f 
and the assumption is that they will also agree at nearby values. This is questionable. 

A more principled approach is the approximation error method [137]. In this 
approach the statistical properties of the modelling error are estimated by by sampling 
over a plausible distribution of solutions and comparing the modeled data from each 
sample with "measured" data from this sample. In practice the measured data could 
be also be modelled but with a much more accurate technique. In [138] this technique 
was shown to result in reconstructed images using a relatively inaccurate forward model 
that were of almost equal quality to those using a more accurate forward model; the 
increase in computational efficiency was an order of magnitude. 

Choice of regularisation \1> is critical and difficult to justify unequivocally. Assuming 
a form 

#(ar) = I ij;(\Vx\))dr, (7.90) 



Jn 

where ip : X — > X is an image to image mapping leads to the linearisation 

I"; 1 = W(x) = V T k(r)V (7.91) 
where the diffusivity is given by 

(7, 2 ) 

Note that the matrix V x is a covariances whose entries represent a correlation between 
pixels. On the other hand I - " 1 , is sparse, representing the local relationship between 
neighbours. We could therefore specify V x in a number of ways 

(i) From a database of representative images 

(ii) By specifying a Markov Random field T^ 1 = W 

(iii) By specifying a PDE r"" 1 = V T W 

Methods such as first-order Tikhonov (Phillips-Twomey), Total Variation (TV), or 
Generalised Gaussian Markov Random Field methods (GGMRF) [139] make statistical 
assumptions about the distribution of edges. They pose assumptions about the 
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regularity (e.g., smoothness) of the solution, the regularity being measured in terms of 
some norm of the solution and its differentials [140]. In addition the "lumpy background" 
prior provides an effective method of estimating information content of different imaging 
systems [141]. 

Image processing methods for computer vision offer a large set of techniques for 
de-noising and segmentation based on anisotropic diffusion processes [142-145]. These 
techniques formulate directly a PDE for image flow, rather than as the Euler equation of 
a variational form and can be considered more general. More specific prior information 
can be incorporated if we assume that some approximate knowledge of the objects such 
as shape topology and intra-region parameter regions is available. [146] 

8. Shape Based Methods 

Optical Tomography as discussed so far in this article is an example of a parameter 
identification problem because the reconstructed images represent the parameters of 
a model of light propagation. In common with many medical imaging modalities 
the reconstructed images are not an end in themselves; they need to be analyzed for 
structural and functional information, including classification of regions, segmentation 
and cross validation with other modalities. 

In the particular case of OT, this post-processing step has some drawbacks. In 
reconstruction, we use small sets of measurements to reconstruct a large image with 
many parameters. In the subsequent analysis stage we analyze these parameters to 
categorise the image into only a few discrete categories e.g. to estimate a classification 
into discrete regions, or to determine the parameters of a low-dimensional shape 
model. In each case, the final result is of considerably smaller dimensionality than 
the intermediate reconstruction. An alternative approach would be to integrate the 
classification or segmentation with the reconstruction. This problem may be better 
posed than the two-stage approach, as we are only moving from the sparse measurements 
to another low-dimensional space. 

In many biomedical applications, the parameter distribution which is sought in 
the inverse problem, contains some sorts of interfaces. These can for example be 
boundaries of some tumor or hematoma, or the interfaces between different organs or 
tissue types, or the boundaries of regions filled with some tracer or marker substance. 
These interfaces are typically not recovered well in classical reconstruction schemes due 
to the above mentioned need for relatively strong regularization tools. Most of these 
regularization tools penalize variations or gradients in the parameter distribution, which 
yields oversmoothed reconstructions. Therefore, interfaces and boundaries are blurred 
and region boundaries cannot easily be detected. However, in many applications it 
is important to be able to find the boundaries of certain inclusions as accurately as 
possible. Whereas inside and outside these subregions the parameters might not vary 
much, often across the interfaces significant jumps occur in the tissue parameters. This 
motivates the possibility of direct shape-based reconstruction methods, which generally 
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can be categorised into explicit and implicit methods. 

Shape reconstruction techniques for Optical Tomography using the DA as the 
forward model are presented in [104, 146-154] and using the RTE for the forward model 
in [155]. 

8.1. Explicit Shape Method 

In shape based methods it is assumed that the domain is represented as in (5. 9), (5. 10). 
The assumption that the optical parameters are constant in subdomains is usually also 
imposed, but can in principle be relaxed. In the explicit shape method we represent the 
boundaries by an explicit parameterisation and construct a forward problem in terms of 
the coefficients of the parameterisation and possibly the optical parameters inside each 
domain. 

8.1.1. Parametric representation of surfaces A typical parametisation of a closed 
surface is in terms trigonometric functions (2D) or spherical harmonics (3D) 



where a purely real basis is constructed from the real and imaginary parts of the complex 
functions. 

These bases may be used to parameterise the radial distance of the surface from an 
internal point, restricting the analysis to star-shaped objects, or to develop a a harmonic 
mapping of the surface onto a circle or sphere, which is made possible by the specific 
representation of each cartesian component seperately in the basis. 

In general, the higher order basis functions are roughly assumed to represent more 
detailed characteristics of the surface, whereas the lower order ones describe more the 
overall features like volume, orientation etc. 

For simplicity we introduce the notation 



to describes the finite set of basis coefficients for the surface £ up to degree K. 

8.1.2. The Forward Problem As well as the set of surface coefficients {7aJ we define the 
set of parameters {xg} for each domain Qg. We now define the nonlinear forward operator 
as the mapping from the optical properties x = {xg}£ — 1,...L of the individual 
regions Qg, £ = 1,...L, and the geometric parameters S = {%} k — 1, . . . , K to 
the measurements on the surface of dQ. Combining the measurements from the S 
independent sources, the forward mapping takes the form 




e M in 2D 

Y l m (d,<p),k = (l + l) 2 + m in 3D 



(8.1) 



S = { lk }, k = l,---K 




y = F(S,x) 



(8.3) 
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8.1.3. The inverse problem The most general inverse problem to be considered in our 
framework would be the simultaneous reconstruction of shape parameters 7 and optical 
parameters x from the data y mca - s . However, in most practical applications which we 
have in mind, good estimates for the optical parameters in most of the regions fij are 
available, and only those optical parameters which correspond to unknown anomalies 
need to be recovered. In particular the outer boundary shape <9f2 is usually considered 
known as well as the optical parameters in the outermost region. Relaxation of these 
assumptions, in particular the first, leads to more complex considerations that have 
so far not been addressed in optical tomography, but for which methods have been 
developed in related fields [156, 157] 

As in the parameter identification problem, the key aspect of using the shape based 
method is to discover a formula for calculating shape sensitivities functions which form 
the kernel of the Frechet derivative of (8.3). Roughly speaking, the shape sensitivity 
function for shape basis function 6 fc is given by the spatial sensitivity function p defined 
in (2.21), projected onto bk on E 



The precise form of the sensitivity function depends on the forward model. In [146] a 
FEM model was used and in [154] a BEM model. In the latter the model is naturally 
represented as the union of subdomains as in (5.9) and the domain boundaries are 
explictly meshed. The fields $ and the normal currents J n = v ■ V$ are explicitly 
represented, and sensitivity function p is defined in terms of both the field and the 
normal currents. Shape updating requires only movement of the mesh points of the 
internal surfaces. By contrast, in the FEM model the surface £ intersects to a subset of 
elements of the mesh. The appropriate form of (8.4) calls for the integration of products 
of forward and adjoint fields along split elements. If the simplified BEM model is used, 
where the normal currents are removed by taking the Schur complement of the discrete 
system(5.21), then the sensitivity functions form FEM and BEM are the same. 

To develop a shape reconstruction scheme, starting from a geometric configuration 
defined by the initial set of shape coefficients S^°\ we will search for the set S that 
minimises the distance between computed ^(7, x) and measured data y meas 



In [146-148, 154] a Levenberg-Marquardt scheme was used for the inversion step 



Vj,i,k — (Pj, 




(8.4) 



S = argmin 5 S(S) := ^\\y mcas - F(S, x) || 2 



(8.5) 



5 (n+l) = gin) + (A T An + A |)-l A T (y 



nicas 



^ (B) ,x))- 



(8.6) 



8.2. Implicit Shape Method 



The level set technique represents the boundary of a domain as the zero-set of a function 
ip. Rather than updating the boundary of the domain an update rule for p is developed 
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and the optical parameters are defined 

x=[ Xb * IP ~° (8.7) 
[ x cxt tp>0 

providing a tool which is able to reconstruct interfaces in the region of interest 
together with certain characteristics of the interior and exterior subregions. The key 
difference from the explicit methods is that topological changes in the domains are 
easily accomodated. A description of this approach for the DA with examples in 2D is 
provided in [153] and the 3D case in [158] . Level set methods based on the RTE model in 
optical tomography were presented in [151, 152]. In these papers the scattering function 
/j, s was assumed known a priori and only the subdomain boundaries of the absorption 
function /i a were reconstructed. So far, these methods have not yet been applied to the 
reconstruction of the low scattering void regions. 

Due to the presence of two parameters /i a and D in the DA, the application 
to Optical Tomography uses two level set function tp^,tp D . Assuming that the 
background absorption and diffusion parameters fi ext and Axt are constant and known, 
the unknowns of the inverse problem are therefore p^^tfE, and the interior absorption 
and diffusion parameters jj,\ nt and Ant- The forward mapping is defined 

y = T (D(lf D , Ant), ^a(¥Va» Mint)) (8-8) 

and we formulate the output least squares cost functional 

£(<PD,<P^, Ant, Mint) = ^||z/ meaS -^(^D,Ant),//a(^ a ,Mint))|| 2 - (8.9) 

The goal is to find a mimimizer (tp, if), D obj , fj, ) of this cost functional. 

For solving the inverse problem an evolution approach is adopted in which the 
unknowns of the inverse problem are assumed to depend on an artificial evolution time 
t and evolve during the reconstruction 

9J t = MO, = KM (8 U) 

where fo, /^ a , hp, /i Ma are forcing terms which point into a descent direction of the cost 
(8.9). We use £ :— (p — to denote the zero set of <p and 

S = .F'*(y meas - J= (D((p D , Ant), /ia(W, Mint)) (8.11) 

for the "classical" descent direction. Then the forcing terms are defined 

fx=W- 1 (x ink -x ext )s x (8.12) 

h x = (x int - x cxt )s x | s (8.13) 

where W is a regularisation term as in section 7.3. Note that the update for the level set 
tp is defined on the whole domain, whereas the update for the parameters is only defined 
on the zero set interface. In practice the level set update may be resticted to a narrow 
band within a specified distance of this interface. For details we refer to [153, 158]. 

The level set approach can easily be generalized to more complex scenarios such as 
smoothly varying or parameterized interior and exterior parameter profiles. 
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9. Conclusions and Further Topics 

Before drawing this review to a conclusion we briefly mention some important topics 
which space precludes us from developing in further detail. 

9.1. Anisotropy 

As mention in section 3.5, if we relax the assumption that the RTE phase function is 
angularly invariant, we can develop an anisotropic model. The derivation of this model 
in terms of a spherical harmonic expansion (i.e. the P/v method) leads to a diffusion 
equation with a tensor diffusion coefficient [159, 160] 



In terms of the inverse problem we are lead to similar difficulties with regard to 
non-uniquness as occur in impedence imaging. In particular a diffeomorphism of Q 
that preserves the Robin-to-Neumann map will be equivalent to a change in the local 
orientation of the eigenvectors of K. As a mechanism for compensating for such non- 
uniqueness, the modelling error approach (see section 7.4) has been employed, [161,162]. 

9.2. Refractive Index Variation 

All the inverse problems we have discussed in this article were based on the transport 
equation or diffusion approximation with constant speed (i.e. constant refractive index). 
However as discussed in section 3.5, there exist a number of models for variable refractive 
index including (3.28) and (3.79). In addition there have been several treatments of 
the radiative transport equation in piecewise homogeneous (i.e. layered) domains with 
constant absorption and scattering but differing refractive index [163-166] wherein semi- 
analytic methods can be derived in terms of Fresnel boundary conditions at the interfaces 
between sub-domains. 

In terms of an inverse problem [167,168] has considered the diffusion approximation 
version of the problem as in (3.79) as the basis for an inversion to recover refractive index 
changes. 

9.3. Time-varying Optical Tomography 

In real biological applications, the subject being studied is not static but time varying. 
In this case we have the choice to reconstruct time-point by time-point or to consider 
a spatial temporal model. There are two main considerations. Firstly the time scale on 
which x is changing may be comparable to the time scale on which the experiments are 
performed. In other words the data acquisition process may be corrupted by movement. 
In this case there are methods from time-series analysis in which the dynamic changes 
in x are included in the reconstruction model. A well known example is the Kalman 
filter method which assumes a random walk model in some state space variables whose 




(9.1) 
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statistics is learnt over the course of time-series experiment. This idea was explored in 
diffuse optical tomography by [169] . In [170] the Kalman filter approach was extended 
to utilise a state space model in which the underlying temporal variation was taken 
to be a mixture of pseudo-sinusoidal signals with random amplitude and phase, which 
allowed seperable reconstruction of functional components in brain images. In [171] a 
fully space-time model was used with data reduction acheived via a Kronecker product 
representation of forward operators. 

The second consideration it that physiological models may be used to predict 
temporal variation which may be used to constrain the reconstruction. These methods 
used coupled systems of non-linear equations which are much more complex than those 
used in a state-space model [172]. A review can be found in [173] 

9.4- Multimodality 

By Multimodality imaging we mean the combination of the varying capacities of 
two or more medical imaging technologies in extracting physiological and anatomical 
information of organisms in a complementary fashion, in order to enable the consistent 
retrieval of accurate and content rich biological information [174-184]. In optical 
tomography, the aim is to enhance the low resolutionfunctional image information with 
high resolution complementary structural information using for example ultrasound 
[185,186], MRI [187,188], or X-rays [189,190]. 

One method for utilising the prior information is to construct priors that constrain 
the reconstruction of the optical information to be commensurate with the auxiliary 
information in a well-defined way. One approach is the use of structural priors which 
smooth the solution within regions but not across boundaries of regions inferred from 
the axiliary modality [191-193]. An alternative is to define an information theoretic 
measure of similarity between the reconstructed image and the prior image [194]. 

The topic of multimodality and the appropriate use of cross- information is rapidly 
expanding within optical tomography in particular and medical imaging in general. 
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